{
 "cells": [
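  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Titanic Disaster Survival Analysis\n",
    "----\n",
    "[Project overview](https://www.kaggle.com/c/titanic)\n",
    "\n",
    "* Goal: predict whether passengers with unknown outcomes survived, based on the available data\n",
    "* Data preparation:\n",
    "    * Data acquisition: load the training CSV and the test CSV\n",
    "    * Data cleaning: fill in or drop missing values, and convert data types\n",
    "    * Data restructuring: reshape the data as needed (reorganize it, build new features)\n",
    "* Data analysis:\n",
    "    * Descriptive analysis: plots and visual inspection\n",
    "    * Exploratory analysis: machine learning models\n",
    "* Deliverable: upload a CSV file to obtain an accuracy score and a leaderboard rank"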
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data loading"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>3</td>\n",
       "      <td>Kelly, Mr. James</td>\n",
       "      <td>male</td>\n",
       "      <td>34.5</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>330911</td>\n",
       "      <td>7.8292</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>3</td>\n",
       "      <td>Wilkes, Mrs. James (Ellen Needs)</td>\n",
       "      <td>female</td>\n",
       "      <td>47.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>363272</td>\n",
       "      <td>7.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>2</td>\n",
       "      <td>Myles, Mr. Thomas Francis</td>\n",
       "      <td>male</td>\n",
       "      <td>62.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>240276</td>\n",
       "      <td>9.6875</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>3</td>\n",
       "      <td>Wirz, Mr. Albert</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>315154</td>\n",
       "      <td>8.6625</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>3</td>\n",
       "      <td>Hirvonen, Mrs. Alexander (Helga E Lindqvist)</td>\n",
       "      <td>female</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>3101298</td>\n",
       "      <td>12.2875</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Pclass                                          Name     Sex  \\\n",
       "0          892       3                              Kelly, Mr. James    male   \n",
       "1          893       3              Wilkes, Mrs. James (Ellen Needs)  female   \n",
       "2          894       2                     Myles, Mr. Thomas Francis    male   \n",
       "3          895       3                              Wirz, Mr. Albert    male   \n",
       "4          896       3  Hirvonen, Mrs. Alexander (Helga E Lindqvist)  female   \n",
       "\n",
       "    Age  SibSp  Parch   Ticket     Fare Cabin Embarked  \n",
       "0  34.5      0      0   330911   7.8292   NaN        Q  \n",
       "1  47.0      1      0   363272   7.0000   NaN        S  \n",
       "2  62.0      0      0   240276   9.6875   NaN        Q  \n",
       "3  27.0      0      0   315154   8.6625   NaN        S  \n",
       "4  22.0      1      1  3101298  12.2875   NaN        S  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train = pd.read_csv(\"train.csv\")\n",
    "test = pd.read_csv(\"test.csv\")\n",
    "test.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>113803</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  \\\n",
       "0            1         0       3   \n",
       "1            2         1       1   \n",
       "2            3         1       3   \n",
       "3            4         1       1   \n",
       "4            5         0       3   \n",
       "\n",
       "                                                Name     Sex   Age  SibSp  \\\n",
       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
       "\n",
       "   Parch            Ticket     Fare Cabin Embarked  \n",
       "0      0         A/5 21171   7.2500   NaN        S  \n",
       "1      0          PC 17599  71.2833   C85        C  \n",
       "2      0  STON/O2. 3101282   7.9250   NaN        S  \n",
       "3      0            113803  53.1000  C123        S  \n",
       "4      0            373450   8.0500   NaN        S  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((891, 12), (418, 11))"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train.shape , test.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 891 entries, 0 to 890\n",
      "Data columns (total 12 columns):\n",
      "PassengerId    891 non-null int64\n",
      "Survived       891 non-null int64\n",
      "Pclass         891 non-null int64\n",
      "Name           891 non-null object\n",
      "Sex            891 non-null object\n",
      "Age            714 non-null float64\n",
      "SibSp          891 non-null int64\n",
      "Parch          891 non-null int64\n",
      "Ticket         891 non-null object\n",
      "Fare           891 non-null float64\n",
      "Cabin          204 non-null object\n",
      "Embarked       889 non-null object\n",
      "dtypes: float64(2), int64(5), object(5)\n",
      "memory usage: 83.6+ KB\n"
     ]
    }
   ],
   "source": [
    "train.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 418 entries, 0 to 417\n",
      "Data columns (total 11 columns):\n",
      "PassengerId    418 non-null int64\n",
      "Pclass         418 non-null int64\n",
      "Name           418 non-null object\n",
      "Sex            418 non-null object\n",
      "Age            332 non-null float64\n",
      "SibSp          418 non-null int64\n",
      "Parch          418 non-null int64\n",
      "Ticket         418 non-null object\n",
      "Fare           417 non-null float64\n",
      "Cabin          91 non-null object\n",
      "Embarked       418 non-null object\n",
      "dtypes: float64(2), int64(4), object(5)\n",
      "memory usage: 36.0+ KB\n"
     ]
    }
   ],
   "source": [
    "test.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data background\n",
    "\n",
    "train.csv\n",
    "* PassengerId: passenger ID\n",
    "* Survived: survival flag, 0 = died, 1 = survived\n",
    "* Pclass: ticket class, 1 = Upper, 2 = Middle, 3 = Lower\n",
    "* Name: passenger name, object\n",
    "* Sex: sex, object\n",
    "* Age: age, 177 missing values\n",
    "* SibSp: number of siblings and spouses aboard\n",
    "* Parch: number of parents and children aboard\n",
    "* Ticket: ticket number, object\n",
    "* Fare: ticket fare\n",
    "* Cabin: cabin number, object, 687 missing values\n",
    "* Embarked: port of embarkation, object, 2 missing values\n",
    "\n",
    "test.csv has the same columns except for Survived, which is the value to predict."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data wrangling\n",
    "\n",
    "Drop columns that have too many missing values or are irrelevant"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7.2500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>71.2833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.9250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>53.1000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8.0500</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass     Sex   Age  SibSp  Parch     Fare\n",
       "0            1         0       3    male  22.0      1      0   7.2500\n",
       "1            2         1       1  female  38.0      1      0  71.2833\n",
       "2            3         1       3  female  26.0      0      0   7.9250\n",
       "3            4         1       1  female  35.0      1      0  53.1000\n",
       "4            5         0       3    male  35.0      0      0   8.0500"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2 = train.loc[:,['PassengerId','Survived','Pclass','Sex','Age','SibSp','Parch','Fare']]\n",
    "train2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>3</td>\n",
       "      <td>male</td>\n",
       "      <td>34.5</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.8292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>3</td>\n",
       "      <td>female</td>\n",
       "      <td>47.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>2</td>\n",
       "      <td>male</td>\n",
       "      <td>62.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>9.6875</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>3</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8.6625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>3</td>\n",
       "      <td>female</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>12.2875</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Pclass     Sex   Age  SibSp  Parch     Fare\n",
       "0          892       3    male  34.5      0      0   7.8292\n",
       "1          893       3  female  47.0      1      0   7.0000\n",
       "2          894       2    male  62.0      0      0   9.6875\n",
       "3          895       3    male  27.0      0      0   8.6625\n",
       "4          896       3  female  22.0      1      1  12.2875"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2 = test.loc[:,['PassengerId','Pclass','Sex','Age','SibSp','Parch','Fare']]\n",
    "test2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 891 entries, 0 to 890\n",
      "Data columns (total 8 columns):\n",
      "PassengerId    891 non-null int64\n",
      "Survived       891 non-null int64\n",
      "Pclass         891 non-null int64\n",
      "Sex            891 non-null object\n",
      "Age            714 non-null float64\n",
      "SibSp          891 non-null int64\n",
      "Parch          891 non-null int64\n",
      "Fare           891 non-null float64\n",
      "dtypes: float64(2), int64(5), object(1)\n",
      "memory usage: 55.8+ KB\n"
     ]
    }
   ],
   "source": [
    "train2.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 418 entries, 0 to 417\n",
      "Data columns (total 7 columns):\n",
      "PassengerId    418 non-null int64\n",
      "Pclass         418 non-null int64\n",
      "Sex            418 non-null object\n",
      "Age            332 non-null float64\n",
      "SibSp          418 non-null int64\n",
      "Parch          418 non-null int64\n",
      "Fare           417 non-null float64\n",
      "dtypes: float64(2), int64(4), object(1)\n",
      "memory usage: 22.9+ KB\n"
     ]
    }
   ],
   "source": [
    "test2.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Columns with missing values to fill in:\n",
    "* `train2.loc[:,'Age']`\n",
    "* `test2.loc[:,'Age']`\n",
    "* `test2.loc[:,'Fare']`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Fill missing Age values in both sets with the median age of the training set\n",
    "age = train2['Age'].median()\n",
    "train2.loc[train2.loc[:,'Age'].isnull(),\"Age\"] = age\n",
    "test2.loc[test2.loc[:,'Age'].isnull(),\"Age\"] = age\n",
    "train2['Age']=train2['Age'].astype(np.int64)\n",
    "test2['Age']=test2['Age'].astype(np.int64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Fill the missing Fare value with the mode (most frequent fare) of the test set\n",
    "Fare = test2['Fare'].mode()\n",
    "Fare\n",
    "test2.loc[test2['Fare'].isnull(),'Fare']= Fare[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 418 entries, 0 to 417\n",
      "Data columns (total 7 columns):\n",
      "PassengerId    418 non-null int64\n",
      "Pclass         418 non-null int64\n",
      "Sex            418 non-null object\n",
      "Age            418 non-null int64\n",
      "SibSp          418 non-null int64\n",
      "Parch          418 non-null int64\n",
      "Fare           418 non-null float64\n",
      "dtypes: float64(1), int64(5), object(1)\n",
      "memory usage: 22.9+ KB\n",
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 891 entries, 0 to 890\n",
      "Data columns (total 8 columns):\n",
      "PassengerId    891 non-null int64\n",
      "Survived       891 non-null int64\n",
      "Pclass         891 non-null int64\n",
      "Sex            891 non-null object\n",
      "Age            891 non-null int64\n",
      "SibSp          891 non-null int64\n",
      "Parch          891 non-null int64\n",
      "Fare           891 non-null float64\n",
      "dtypes: float64(1), int64(6), object(1)\n",
      "memory usage: 55.8+ KB\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(None, None)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2.info(),train2.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data type conversion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0        male\n",
       "1      female\n",
       "2        male\n",
       "3        male\n",
       "4      female\n",
       "5        male\n",
       "6      female\n",
       "7        male\n",
       "8      female\n",
       "9        male\n",
       "10       male\n",
       "11       male\n",
       "12     female\n",
       "13       male\n",
       "14     female\n",
       "15     female\n",
       "16       male\n",
       "17       male\n",
       "18     female\n",
       "19     female\n",
       "20       male\n",
       "21       male\n",
       "22     female\n",
       "23       male\n",
       "24     female\n",
       "25       male\n",
       "26     female\n",
       "27       male\n",
       "28       male\n",
       "29       male\n",
       "        ...  \n",
       "388      male\n",
       "389      male\n",
       "390      male\n",
       "391    female\n",
       "392      male\n",
       "393      male\n",
       "394      male\n",
       "395    female\n",
       "396      male\n",
       "397    female\n",
       "398      male\n",
       "399      male\n",
       "400    female\n",
       "401      male\n",
       "402    female\n",
       "403      male\n",
       "404      male\n",
       "405      male\n",
       "406      male\n",
       "407      male\n",
       "408    female\n",
       "409    female\n",
       "410    female\n",
       "411    female\n",
       "412    female\n",
       "413      male\n",
       "414    female\n",
       "415      male\n",
       "416      male\n",
       "417      male\n",
       "Name: Sex, Length: 418, dtype: object"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert string values to numbers\n",
    "train2.loc[:,'Sex']\n",
    "test2.loc[:,'Sex']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>22</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7.2500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>38</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>71.2833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>26</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.9250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>53.1000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>35</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8.0500</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  Sex  Age  SibSp  Parch     Fare\n",
       "0            1         0       3    1   22      1      0   7.2500\n",
       "1            2         1       1    0   38      1      0  71.2833\n",
       "2            3         1       3    0   26      0      0   7.9250\n",
       "3            4         1       1    0   35      1      0  53.1000\n",
       "4            5         0       3    1   35      0      0   8.0500"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2['Sex']=train2['Sex'].map({'female': 0, 'male': 1}).astype(int)\n",
    "train2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>34</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>7.8292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>47</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>7.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>62</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>9.6875</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>27</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8.6625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>22</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>12.2875</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Pclass  Sex  Age  SibSp  Parch     Fare\n",
       "0          892       3    1   34      0      0   7.8292\n",
       "1          893       3    0   47      1      0   7.0000\n",
       "2          894       2    1   62      0      0   9.6875\n",
       "3          895       3    1   27      0      0   8.6625\n",
       "4          896       3    0   22      1      1  12.2875"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2['Sex'] = test2['Sex'].map({'female': 0, 'male': 1}).astype(int)\n",
    "test2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 891 entries, 0 to 890\n",
      "Data columns (total 8 columns):\n",
      "PassengerId    891 non-null int64\n",
      "Survived       891 non-null int64\n",
      "Pclass         891 non-null int64\n",
      "Sex            891 non-null int32\n",
      "Age            891 non-null int64\n",
      "SibSp          891 non-null int64\n",
      "Parch          891 non-null int64\n",
      "Fare           891 non-null float64\n",
      "dtypes: float64(1), int32(1), int64(6)\n",
      "memory usage: 52.3 KB\n",
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 418 entries, 0 to 417\n",
      "Data columns (total 7 columns):\n",
      "PassengerId    418 non-null int64\n",
      "Pclass         418 non-null int64\n",
      "Sex            418 non-null int32\n",
      "Age            418 non-null int64\n",
      "SibSp          418 non-null int64\n",
      "Parch          418 non-null int64\n",
      "Fare           418 non-null float64\n",
      "dtypes: float64(1), int32(1), int64(5)\n",
      "memory usage: 21.3 KB\n"
     ]
    }
   ],
   "source": [
    "train2.info()\n",
    "test2.info()"
   ]
  },
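  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Feature engineering\n",
    "\n",
    "Build two new features from the SibSp and Parch columns:\n",
    "\n",
    "* SibSp: number of siblings and spouses aboard\n",
    "* Parch: number of parents and children aboard\n",
    "* familysize: total family size, SibSp + Parch + 1 (the passenger counts as one)\n",
    "* isalone: 1 if the passenger travels alone (familysize == 1), otherwise 0"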
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "# familysize: siblings/spouses + parents/children + the passenger\n",
    "train2['familysize'] = train2['SibSp']+train2['Parch']+1\n",
    "test2['familysize'] = test2['SibSp']+test2['Parch']+1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "# isalone (training set): 1 when familysize == 1, otherwise 0\n",
    "train2['isalone'] = 0\n",
    "train2.loc[train2['familysize']==1,'isalone'] = 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "# isalone (test set): 1 when familysize == 1, otherwise 0\n",
    "test2['isalone'] = 0\n",
    "test2.loc[test2['familysize']==1,'isalone'] = 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Drop the SibSp and Parch columns now that familysize and isalone replace them\n",
    "train2 = train2.drop(['SibSp','Parch'],axis=1)\n",
    "test2 = test2.drop(['SibSp','Parch'],axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>22</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>38</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>26</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>35</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  Sex  Age     Fare  familysize  isalone\n",
       "0            1         0       3    1   22   7.2500           2        0\n",
       "1            2         1       1    0   38  71.2833           2        0\n",
       "2            3         1       3    0   26   7.9250           1        1\n",
       "3            4         1       1    0   35  53.1000           2        0\n",
       "4            5         0       3    1   35   8.0500           1        1"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>34</td>\n",
       "      <td>7.8292</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>47</td>\n",
       "      <td>7.0000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>62</td>\n",
       "      <td>9.6875</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>27</td>\n",
       "      <td>8.6625</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>22</td>\n",
       "      <td>12.2875</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Pclass  Sex  Age     Fare  familysize  isalone\n",
       "0          892       3    1   34   7.8292           1        1\n",
       "1          893       3    0   47   7.0000           2        0\n",
       "2          894       2    1   62   9.6875           1        1\n",
       "3          895       3    1   27   8.6625           1        1\n",
       "4          896       3    0   22  12.2875           3        0"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.383838</td>\n",
       "      <td>2.308642</td>\n",
       "      <td>0.647587</td>\n",
       "      <td>29.345679</td>\n",
       "      <td>32.204208</td>\n",
       "      <td>1.904602</td>\n",
       "      <td>0.602694</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>257.353842</td>\n",
       "      <td>0.486592</td>\n",
       "      <td>0.836071</td>\n",
       "      <td>0.477990</td>\n",
       "      <td>13.028212</td>\n",
       "      <td>49.693429</td>\n",
       "      <td>1.613459</td>\n",
       "      <td>0.489615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>223.500000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>22.000000</td>\n",
       "      <td>7.910400</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>28.000000</td>\n",
       "      <td>14.454200</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>668.500000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>35.000000</td>\n",
       "      <td>31.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>80.000000</td>\n",
       "      <td>512.329200</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       PassengerId    Survived      Pclass         Sex         Age  \\\n",
       "count   891.000000  891.000000  891.000000  891.000000  891.000000   \n",
       "mean    446.000000    0.383838    2.308642    0.647587   29.345679   \n",
       "std     257.353842    0.486592    0.836071    0.477990   13.028212   \n",
       "min       1.000000    0.000000    1.000000    0.000000    0.000000   \n",
       "25%     223.500000    0.000000    2.000000    0.000000   22.000000   \n",
       "50%     446.000000    0.000000    3.000000    1.000000   28.000000   \n",
       "75%     668.500000    1.000000    3.000000    1.000000   35.000000   \n",
       "max     891.000000    1.000000    3.000000    1.000000   80.000000   \n",
       "\n",
       "             Fare  familysize     isalone  \n",
       "count  891.000000  891.000000  891.000000  \n",
       "mean    32.204208    1.904602    0.602694  \n",
       "std     49.693429    1.613459    0.489615  \n",
       "min      0.000000    1.000000    0.000000  \n",
       "25%      7.910400    1.000000    0.000000  \n",
       "50%     14.454200    1.000000    1.000000  \n",
       "75%     31.000000    2.000000    1.000000  \n",
       "max    512.329200   11.000000    1.000000  "
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "      <td>418.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>1100.500000</td>\n",
       "      <td>2.265550</td>\n",
       "      <td>0.636364</td>\n",
       "      <td>29.779904</td>\n",
       "      <td>35.560497</td>\n",
       "      <td>1.839713</td>\n",
       "      <td>0.605263</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>120.810458</td>\n",
       "      <td>0.841838</td>\n",
       "      <td>0.481622</td>\n",
       "      <td>12.686191</td>\n",
       "      <td>55.857145</td>\n",
       "      <td>1.519072</td>\n",
       "      <td>0.489380</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>892.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>996.250000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>23.000000</td>\n",
       "      <td>7.895800</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>1100.500000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>28.000000</td>\n",
       "      <td>14.454200</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>1204.750000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>35.750000</td>\n",
       "      <td>31.471875</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>1309.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>76.000000</td>\n",
       "      <td>512.329200</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       PassengerId      Pclass         Sex         Age        Fare  \\\n",
       "count   418.000000  418.000000  418.000000  418.000000  418.000000   \n",
       "mean   1100.500000    2.265550    0.636364   29.779904   35.560497   \n",
       "std     120.810458    0.841838    0.481622   12.686191   55.857145   \n",
       "min     892.000000    1.000000    0.000000    0.000000    0.000000   \n",
       "25%     996.250000    1.000000    0.000000   23.000000    7.895800   \n",
       "50%    1100.500000    3.000000    1.000000   28.000000   14.454200   \n",
       "75%    1204.750000    3.000000    1.000000   35.750000   31.471875   \n",
       "max    1309.000000    3.000000    1.000000   76.000000  512.329200   \n",
       "\n",
       "       familysize     isalone  \n",
       "count  418.000000  418.000000  \n",
       "mean     1.839713    0.605263  \n",
       "std      1.519072    0.489380  \n",
       "min      1.000000    0.000000  \n",
       "25%      1.000000    0.000000  \n",
       "50%      1.000000    1.000000  \n",
       "75%      2.000000    1.000000  \n",
       "max     11.000000    1.000000  "
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Descriptive analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>22</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>38</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>26</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>35</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  Sex  Age     Fare  familysize  isalone\n",
       "0            1         0       3    1   22   7.2500           2        0\n",
       "1            2         1       1    0   38  71.2833           2        0\n",
       "2            3         1       3    0   26   7.9250           1        1\n",
       "3            4         1       1    0   35  53.1000           2        0\n",
       "4            5         0       3    1   35   8.0500           1        1"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Survived</th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>All</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Pclass</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>80</td>\n",
       "      <td>136</td>\n",
       "      <td>216</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>97</td>\n",
       "      <td>87</td>\n",
       "      <td>184</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>372</td>\n",
       "      <td>119</td>\n",
       "      <td>491</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>All</th>\n",
       "      <td>549</td>\n",
       "      <td>342</td>\n",
       "      <td>891</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Survived    0    1  All\n",
       "Pclass                 \n",
       "1          80  136  216\n",
       "2          97   87  184\n",
       "3         372  119  491\n",
       "All       549  342  891"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use a cross-tabulation to count passengers by class and survival (margins=True adds the All totals)\n",
    "pclass = pd.crosstab(train2.Pclass,train2.Survived,margins=True)\n",
    "pclass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x2808fec6860>"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEGCAYAAABrQF4qAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAFg5JREFUeJzt3XGQVeWZ5/Hvk4YFJxgN0BqkiU0iqRFiJLHBpCxTrEkpISl0qoJAzSpGE0zEWqZ2dmswVVlxa6jKZDOTSkzWlSkdyERFopmCOI4Z1sQ4SYza7SAR0RKjE1oYBYwkxCiCz/7RB9OjDX27+3bf7pfvp+rWPfc97znnOd1Vv3vue885NzITSVK53tboAiRJg8ugl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBVuVKMLAJg4cWK2trY2ugxJGlE6Ojr2ZGZzb/2GRdC3trbS3t7e6DIkaUSJiH+rpZ9DN5JUOINekgpn0EtS4YbFGH1PXnvtNTo7O3nllVcaXcqAjR07lpaWFkaPHt3oUiQdg4Zt0Hd2dnL88cfT2tpKRDS6nH7LTPbu3UtnZydTp05tdDmSjkHDdujmlVdeYcKECSM65AEiggkTJhTxyUTSyDRsgx4Y8SF/WCn7IWlkGtZBL0kauGE7Rn8kq1at4tZbb6WpqYm3ve1t3HjjjZx99tkDWufGjRt5/PHHWbFixYDrGzduHPv37x/weqQSta74xyHd3rNf/uSQbm+4GlFB/8ADD3DXXXfxyCOPMGbMGPbs2cOBAwdqWvbgwYOMGtXz7s6fP5/58+fXs1RJGjZG1NDNrl27mDhxImPGjAFg4sSJnHLKKbS2trJnzx4A2tvbmTNnDgArV65k6dKlnH/++Vx66aWcffbZbN269Y31zZkzh46ODtasWcPVV1/Nvn37aG1t5fXXXwfg5ZdfZsqUKbz22ms8/fTTzJ07l7POOotzzz2XJ554AoBnnnmGj3zkI8yaNYsvfelLQ/jXkKTajKigP//889mxYwfve9/7uOqqq/jxj3/c6zIdHR1s2LCBW2+9lUWLFrF+/Xqg601j586dnHXWWW/0PeGEEzjzzDPfWO/3v/99LrjgAkaPHs3SpUu5/vrr6ejo4Ktf/SpXXXUVAMuXL+cLX/gCDz/8MO9617sGYa8laWBGVNCPGzeOjo4OVq9eTXNzMwsXLmTNmjVHXWb+/Pkcd9xxAFx88cV897vfBWD9+vUsWLDgLf0XLlzI7bffDsC6detYuHAh+/fv52c/+xkLFixg5syZXHnllezatQuAn/70pyxevBiASy65pF67Kkl1M6LG6AGampqYM2cOc+bM4YwzzmDt2rWMGjXqjeGWN5+v/va3v/2N6cmTJzNhwgS2bNnC7bffzo033viW9c+fP59rrrmGF198kY6ODs477zx+97vfceKJJ7J58+Yea/L0SUnD2Yg6on/yySd56qmn3ni9efNmTj31VFpbW+no6ADgzjvvPOo6Fi1axFe+8hX27dvHGWec8Zb548aNY/bs2SxfvpxPfepTNDU18Y53vIOpU6e+8WkgM3n00UcBOOecc1i3bh0At9xyS132U5LqaUQF/f79+1myZAnTp0/nAx/4AI8//jgrV67k2muvZfny5Zx77rk0NTUddR2f/vSnWbduHRdffPER+yxcuJDvfOc7LFy48I22W265hZtuuokzzzyTGTNmsGHDBgC+/vWv861vfYtZs2axb9+++uyoJNVRZGaja6CtrS3f/MMj27Zt4/TTT29QRfVX2v5I/eF59PUVER2Z2dZbv16P6CNibEQ8FBGPRsTWiLiual8TEc9ExObqMbNqj4j4RkRsj4gtEfGhge+OJKm/avky9lXgvMzcHxGjgZ9ExD9V8/5HZt7xpv6fAKZVj7OBG6pnSVID9HpEn10OX9M/unocbbznQuDb1XI/B06MiEkD
L1WS1B81fRkbEU0RsRl4AdiUmQ9Ws1ZVwzNfi4gxVdtkYEe3xTurNklSA9QU9Jl5KDNnAi3A7Ih4P3AN8MfALGA88BdV955OKn/LJ4CIWBoR7RHRvnv37n4VL0nqXZ9Or8zMl4D7gLmZuasannkV+DtgdtWtE5jSbbEWYGcP61qdmW2Z2dbc3Nyv4iVJvev1y9iIaAZey8yXIuI44OPAX0XEpMzcFV2XhV4EPFYtshG4OiLW0fUl7L7M3DUYxdf7VK1aT8W65557WL58OYcOHeKzn/1sXW5vLEmDpZazbiYBayOiia5PAOsz866I+GH1JhDAZuDzVf+7gXnAduBl4DP1L7txDh06xLJly9i0aRMtLS3MmjWL+fPnM3369EaXJkk96jXoM3ML8MEe2s87Qv8Elg28tOHpoYce4rTTTuM973kP0HVLhQ0bNhj0koatEXULhOHgueeeY8qUP3wF0dLSwnPPPdfAiiTp6Az6PurplhHevVLScGbQ91FLSws7dvzhMoHOzk5OOeWUBlYkSUdn0PfRrFmzeOqpp3jmmWc4cOAA69at8/dmJQ1rI+6HR7prxJ3pRo0axTe/+U0uuOACDh06xOWXX86MGTOGvA5JqtWIDvpGmTdvHvPmzWt0GZJUE4duJKlwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuFG9umVK0+o8/r29drl8ssv56677uKkk07iscce67W/JDWaR/R9dNlll3HPPfc0ugxJqplB30cf/ehHGT9+fKPLkKSaGfSSVDiDXpIKZ9BLUuEMekkqXK+nV0bEWOB+YEzV/47MvDYipgLrgPHAI8AlmXkgIsYA3wbOAvYCCzPz2UGpvobTIett8eLF3HfffezZs4eWlhauu+46rrjiiiGvQ5JqVct59K8C52Xm/ogYDfwkIv4J+G/A1zJzXUT8X+AK4Ibq+deZeVpELAL+Clg4SPUPudtuu63RJUhSn/Q6dJNd9lcvR1ePBM4D7qja1wIXVdMXVq+p5n8s/FFVSWqYmsboI6IpIjYDLwCbgKeBlzLzYNWlE5hcTU8GdgBU8/cBE3pY59KIaI+I9t27dw9sLyRJR1RT0GfmocycCbQAs4HTe+pWPfd09J5vachcnZltmdnW3Nx8pO3WUt6wV8p+SBqZ+nTWTWa+BNwHfBg4MSIOj/G3ADur6U5gCkA1/wTgxb4WNnbsWPbu3TviQzIz2bt3L2PHjm10KZKOUbWcddMMvJaZL0XEccDH6fqC9UfAp+k682YJsKFaZGP1+oFq/g+zH2nd0tJCZ2cnJQzrjB07lpaWlkaXIekYVctZN5OAtRHRRNcngPWZeVdEPA6si4i/BP4VuKnqfxPw9xGxna4j+UX9KWz06NFMnTq1P4tKkrrpNegzcwvwwR7af0nXeP2b218BFtSlOknSgHllrCQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klS4XoM+IqZExI8iYltEbI2I5VX7yoh4LiI2V4953Za5JiK2R8STEXHBYO6AJOnoev1xcOAg8OeZ+UhEHA90RMSmat7XMvOr3TtHxHRgETADOAX4fxHxvsw8VM/CJUm16fWIPjN3ZeYj1fRvgW3A5KMsciGwLjNfzcxngO3A7HoUK0nquz6N0UdEK/BB4MGq6eqI2BIRN0fEO6u2ycCObot10sMbQ0QsjYj2iGjfvXt3nwuXJNWm5qCPiHHAncCfZeZvgBuA9wIzgV3AXx/u2sPi+ZaGzNWZ2ZaZbc3NzX0uXJJUm5qCPiJG0xXyt2Tm9wAy8/nMPJSZrwN/yx+GZzqBKd0WbwF21q9kSVJf1HLWTQA3Adsy82+6tU/q1u1PgMeq6Y3AoogYExFTgWnAQ/UrWZLUF7WcdXMOcAnwi4jYXLV9EVgcETPpGpZ5FrgSIDO3RsR64HG6
zthZ5hk3ktQ4vQZ9Zv6Ensfd7z7KMquAVQOoS5JUJ14ZK0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBWu16CPiCkR8aOI2BYRWyNiedU+PiI2RcRT1fM7q/aIiG9ExPaI2BIRHxrsnZAkHVktR/QHgT/PzNOBDwPLImI6sAK4NzOnAfdWrwE+AUyrHkuBG+petSSpZr0GfWbuysxHqunfAtuAycCFwNqq21rgomr6QuDb2eXnwIkRManulUuSatKnMfqIaAU+CDwInJyZu6DrzQA4qeo2GdjRbbHOqu3N61oaEe0R0b579+6+Vy5JqknNQR8R44A7gT/LzN8crWsPbfmWhszVmdmWmW3Nzc21liFJ6qOagj4iRtMV8rdk5veq5ucPD8lUzy9U7Z3AlG6LtwA761OuJKmvajnrJoCbgG2Z+TfdZm0EllTTS4AN3dovrc6++TCw7/AQjyRp6I2qoc85wCXALyJic9X2ReDLwPqIuAL4FbCgmnc3MA/YDrwMfKauFUuS+qTXoM/Mn9DzuDvAx3ron8CyAdYlSaoTr4yVpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCtdr0EfEzRHxQkQ81q1tZUQ8FxGbq8e8bvOuiYjtEfFkRFwwWIVLkmpTyxH9GmBuD+1fy8yZ1eNugIiYDiwCZlTL/J+IaKpXsZKkvus16DPzfuDFGtd3IbAuM1/NzGeA7cDsAdQnSRqggYzRXx0RW6qhnXdWbZOBHd36dFZtbxERSyOiPSLad+/ePYAyJElH09+gvwF4LzAT2AX8ddUePfTNnlaQmaszsy0z25qbm/tZhiSpN/0K+sx8PjMPZebrwN/yh+GZTmBKt64twM6BlShJGoh+BX1ETOr28k+Aw2fkbAQWRcSYiJgKTAMeGliJkqSBGNVbh4i4DZgDTIyITuBaYE5EzKRrWOZZ4EqAzNwaEeuBx4GDwLLMPDQ4pUuSatFr0Gfm4h6abzpK/1XAqoEUJUmqH6+MlaTCGfSSVLheh25UBytPGOLt7Rva7Uka1jyil6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOC6YklcuLFQGP6CWpeAa9JBXOoJekwhn0klQ4g16SCmfQS1Lheg36iLg5Il6IiMe6tY2PiE0R8VT1/M6qPSLiGxGxPSK2RMSHBrN4SVLvajmiXwPMfVPbCuDezJwG3Fu9BvgEMK16LAVuqE+ZkqT+6jXoM/N+4MU3NV8IrK2m1wIXdWv/dnb5OXBiREyqV7GSpL7r7xj9yZm5C6B6Pqlqnwzs6Navs2qTJDVIvb+MjR7asseOEUsjoj0i2nfv3l3nMiRJh/X3XjfPR8SkzNxVDc28ULV3AlO69WsBdva0gsxcDawGaGtr6/HNQOpJ64p/HNLtPfvlTw7p9qR66+8R/UZgSTW9BNjQrf3S6uybDwP7Dg/xSJIao9cj+oi4DZgDTIyITuBa4MvA+oi4AvgVsKDqfjcwD9gOvAx8ZhBqliT1Qa9Bn5mLjzDrYz30TWDZQIuSJNWPV8ZKUuEMekkqnL8wJfXGXynSCOcRvSQVzqCXpMIZ9JJUuGNyjH7Ir6wcO6Sbk6T/wCN6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBVuQDc1i4hngd8Ch4CDmdkWEeOB24FW4Fng4sz89cDKlCT1Vz2O6P9zZs7MzLbq9Qrg3sycBtxbvZYkNchgDN1cCKytptcCFw3CNiRJNRpo0CfwzxHRERFLq7aTM3MXQPV80gC3IUkagIH+8Mg5mbkzIk4CNkXE
E7UuWL0xLAV497vfPcAyJElHMqAj+szcWT2/APwDMBt4PiImAVTPLxxh2dWZ2ZaZbc3NzQMpQ5J0FP0O+oh4e0Qcf3gaOB94DNgILKm6LQE2DLRISVL/DWTo5mTgHyLi8Hpuzcx7IuJhYH1EXAH8Clgw8DIlSf3V76DPzF8CZ/bQvhf42ECKkiTVj1fGSlLhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUbtKCPiLkR8WREbI+IFYO1HUnS0Q1K0EdEE/At4BPAdGBxREwfjG1Jko5usI7oZwPbM/OXmXkAWAdcOEjbkiQdxahBWu9kYEe3153A2d07RMRSYGn1cn9EPDlItTRcwERgz5Bt8LoYsk0dC/z/jVzHwP/u1Fo6DVbQ97S3+R9eZK4GVg/S9oeViGjPzLZG16H+8f83cvm/6zJYQzedwJRur1uAnYO0LUnSUQxW0D8MTIuIqRHxn4BFwMZB2pYk6SgGZegmMw9GxNXAD4Am4ObM3DoY2xohjokhqoL5/xu5/N8BkZm995IkjVheGStJhTPoJalwBr0kFW6wzqOXpCEXEbOBzMyHq9uuzAWeyMy7G1xaQ/llrNRNRPwxXVd2P5iZ+7u1z83MexpXmXoTEdfSdX+tUcAmuq7Gvw/4OPCDzFzVuOoay6AfQhHxmcz8u0bXoZ5FxH8FlgHbgJnA8szcUM17JDM/1Mj6dHQR8Qu6/m9jgH8HWjLzNxFxHF1v3B9oaIEN5NDN0LoOMOiHr88BZ2Xm/ohoBe6IiNbM/Do939ZDw8vBzDwEvBwRT2fmbwAy8/cR8XqDa2sog77OImLLkWYBJw9lLeqzpsPDNZn5bETMoSvsT8WgHwkORMQfZebLwFmHGyPiBMCgV12dDFwA/PpN7QH8bOjLUR/8e0TMzMzNANWR/aeAm4EzGluaavDRzHwVIDO7B/toYEljShoeDPr6uwsYdzgsuouI+4a+HPXBpcDB7g2ZeRC4NCJubExJqtXhkO+hfQ9DeaviYcgvYyWpcF4wJUmFM+glqXAGvY4JEXEoIjZHxGMR8d2I+KOj9F0ZEf99KOuTBpNBr2PF7zNzZma+HzgAfL7RBUlDxaDXsehfgNMAIuLSiNgSEY9GxN+/uWNEfC4iHq7m33n4k0BELKg+HTwaEfdXbTMi4qHqk8OWiJg2pHslHYFn3eiYEBH7M3NcRIwC7gTuAe4Hvgeck5l7ImJ8Zr4YESuB/Zn51YiYkJl7q3X8JfB8Zl5fXW4/NzOfi4gTM/OliLge+Hlm3lL9hGZTZv6+ITssdeMRvY4Vx0XEZqAd+BVwE3AecEd1njWZ+WIPy70/Iv6lCvY/BWZU7T8F1kTE5+j6uUyAB4AvRsRfAKca8houvGBKx4rfZ+bM7g0REUBvH2nXABdl5qMRcRkwByAzPx8RZwOfBDZXV9TeGhEPVm0/iIjPZuYP67wfUp95RK9j2b3AxRExASAixvfQ53hgV0SMpuuInqrvezPzwcz8n3RddTklIt4D/DIzvwFsBI7ZuyVqePGIXseszNwaEauAH0fEIeBfgcve1O1LwIPAvwG/oCv4Af539WVr0PWG8SiwAvgvEfEaXbfJ/V+DvhNSDfwyVpIK59CNJBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mF+//ETzqtU1jvGAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "pclass.plot.bar()"
   ]
  },
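  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conclusion: third class carried the most passengers and had the lowest survival rate; the higher the cabin class, the higher the survival rate."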
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Survived</th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Sex</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>81</td>\n",
       "      <td>233</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>468</td>\n",
       "      <td>109</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Survived    0    1\n",
       "Sex               \n",
       "0          81  233\n",
       "1         468  109"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Cross-tabulate sex against survival\n",
    "sex = pd.crosstab(train2.Sex,train2.Survived)\n",
    "sex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x2808ff40b00>"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEGCAYAAABrQF4qAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAEZNJREFUeJzt3X+MVeWdx/H31xl+uMVCC2OrDDo00l1lrbaAaIyG1Y1YatBsRCAbpZWGpmpK0022uJtGTGq2Nc3aak0jWbrSVvnRuhuQtjauLW5a2ypjkQrUBatbRkgFVHapoQh+9485UBZG5wL3znWeeb+SyT3nOc8553vJ8Jlnnjnn3MhMJEnlOqnZBUiSGsugl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBWutdkFAIwaNSo7OjqaXYYk9SudnZ07M7Ott37viKDv6Ohg7dq1zS5DkvqViPjvWvo5dSNJhTPoJalwBr0kFe4dMUcvSfX2xhtv0NXVxd69e5tdygkbOnQo7e3tDBo06Lj2N+glFamrq4tTTjmFjo4OIqLZ5Ry3zGTXrl10dXUxduzY4zqGUzeSirR3715GjhzZr0MeICIYOXLkCf1mYtBLKlZ/D/mDTvR9GPSSVDjn6KUCdCz4frNLqMmLX/pYs0vgjjvu4MEHH6SlpYWTTjqJ++67j8mTJ5/QMVetWsXGjRtZsGDBCdc3bNgw9uzZc8LHOZxBL2nA+PnPf87q1at5+umnGTJkCDt37mTfvn017bt//35aW3uOzOnTpzN9+vR6llpXTt1IGjC2b9/OqFGjGDJkCACjRo3i9NNPp6Ojg507dwKwdu1apkyZAsDChQuZN28eV1xxBTfccAOTJ09mw4YNh443ZcoUOjs7uf/++7nlllvYvXs3HR0dvPnmmwC8/vrrjBkzhjfeeIPnn3+eK6+8kgkTJnDJJZfwm9/8BoAXXniBiy66iEmTJvGFL3yhIe/boJc0YFxxxRVs3bqVD37wg9x00008/vjjve7T2dnJypUrefDBB5k1axYrVqwAun9obNu2jQkTJhzqO3z4cM4777xDx3344YeZOnUqgwYNYt68edxzzz10dnbyla98hZtuugmA+fPn8+lPf5qnnnqK97///Q141wa9pAFk2LBhdHZ2smjRItra2pg5cyb333//2+4zffp0Tj75ZACuu+46vvvd7wKwYsUKZsyYcVT/mTNnsnz5cgCWLVvGzJkz2bNnD0888QQzZszg/PPP51Of+hTbt28H4Gc/+xmzZ88G4Prrr6/XW/1/nKOXNKC0tLQwZcoUpkyZwrnnnsuSJUtobW09NN1y5PXq73rXuw4tjx49mpEjR7J+/XqWL1/Offfdd9Txp0+fzq233sorr7xCZ2cnl112GX/4wx8YMWIE69at67GmRl8G6ohe0oDx3HPPsXnz5kPr69at48wzz6Sjo4POzk4AHnroobc9xqxZs7jzzjvZvXs355577lHbhw0bxgUXXMD8+fO56qqraGlp4d3vfjdjx4499NtAZvLMM88AcPHFF7Ns2TIAHnjggbq8zyMZ9JIGjD179jBnzhzOOeccPvShD7Fx40YWLlzIbbfdxvz587nkkktoaWl522Nce+21LFu2jOuuu+4t+8ycOZPvfOc7zJw581DbAw88wOLFiznvvPMYP348K1euBOBrX/sa9957L5MmTWL37t31eaNHiMxsyIGPxcSJE9MPHpGOn9fRH23Tpk2cffbZfXa+Ruvp/UREZ2ZO7G1fR/SSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcN4ZK2nAqvdlqbVcPvrII48wf/58Dhw4wCc/+cm6PNq4N47oJamPHDhwgJtvvpkf/vCHbNy4kaVLl7Jx48aGn9egl6Q+8uSTT3LWWWfxgQ98gMGDBzNr1qxDd8g2kkEvSX3kpZdeYsyYMYfW29vb
eemllxp+XoNekvpIT4+c6YsPMDfoJamPtLe3s3Xr1kPrXV1dnH766Q0/r0EvSX1k0qRJbN68mRdeeIF9+/axbNmyPvmsWS+vlDRg9eXTNAFaW1v5+te/ztSpUzlw4AA33ngj48ePb/x5G34GSdIh06ZNY9q0aX16TqduJKlwBr0kFc6gl6TC1Rz0EdESEb+KiNXV+tiI+GVEbI6I5RExuGofUq1vqbZ3NKZ0SVItjmVEPx/YdNj6l4G7MnMc8Cowt2qfC7yamWcBd1X9JElNUlPQR0Q78DHgX6r1AC4Dvld1WQJcUy1fXa1Tbb88+uLWL0lSj2q9vPKrwN8Dp1TrI4HXMnN/td4FjK6WRwNbATJzf0TsrvrvPPyAETEPmAdwxhlnHG/9knT8Fg6v8/F299rlxhtvZPXq1Zx66qk8++yz9T3/W+h1RB8RVwEvZ2bn4c09dM0atv2pIXNRZk7MzIltbW01FStJ/d3HP/5xHnnkkT49Zy0j+ouB6RExDRgKvJvuEf6IiGitRvXtwLaqfxcwBuiKiFZgOPBK3SuXpH7o0ksv5cUXX+zTc/Y6os/MWzOzPTM7gFnAjzPzb4GfANdW3eYABx+qvKpap9r+4+zpkW2SpD5xItfRfx74XERsoXsOfnHVvhgYWbV/Dmj852RJkt7SMT3rJjPXAGuq5d8CF/TQZy8wow61SZLqwDtjJalwPr1S0sBVw+WQ9TZ79mzWrFnDzp07aW9v5/bbb2fu3Lm973gCDHpJ6kNLly7t83M6dSNJhTPoJalwBr2kYpVyC8+Jvg+DXlKRhg4dyq5du/p92Gcmu3btYujQocd9DP8YK6lI7e3tdHV1sWPHjmaXcsKGDh1Ke3v7ce9v0Esq0qBBgxg7dmyzy3hHcOpGkgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4XoN+ogYGhFPRsQzEbEhIm6v2sdGxC8jYnNELI+IwVX7kGp9S7W9o7FvQZL0dmoZ0f8RuCwzzwPOB66MiAuBLwN3ZeY44FVgbtV/LvBqZp4F3FX1kyQ1Sa9Bn932VKuDqq8ELgO+V7UvAa6plq+u1qm2Xx4RUbeKJUnHpKY5+ohoiYh1wMvAo8DzwGuZub/q0gWMrpZHA1sBqu27gZH1LFqSVLuagj4zD2Tm+UA7cAFwdk/dqteeRu95ZENEzIuItRGxdseOHbXWK0k6Rsd01U1mvgasAS4ERkREa7WpHdhWLXcBYwCq7cOBV3o41qLMnJiZE9va2o6veklSr2q56qYtIkZUyycDfw1sAn4CXFt1mwOsrJZXVetU23+cmUeN6CVJfaO19y6cBiyJiBa6fzCsyMzVEbERWBYRXwR+BSyu+i8Gvh0RW+geyc9qQN2SpBr1GvSZuR74cA/tv6V7vv7I9r3AjLpUJ0k6Yd4ZK0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKlxrswtQAywc3uwKarNwd7MrkAYER/SSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1Lheg36iBgTET+JiE0RsSEi5lft742IRyNic/X6nqo9IuLuiNgSEesj4iONfhOSpLdWy4h+P/B3mXk2cCFwc0ScAywAHsvMccBj1TrAR4Fx1dc84Bt1r1qSVLNegz4zt2fm09Xy/wKbgNHA1cCSqtsS4Jpq+WrgW9ntF8CIiDit7pVLkmpyTHP0EdEBfBj4JfC+zNwO3T8MgFOrbqOBrYft1lW1HXms
eRGxNiLW7tix49grlyTVpOagj4hhwEPAZzPzf96uaw9teVRD5qLMnJiZE9va2motQ5J0jGoK+ogYRHfIP5CZ/1Y1//7glEz1+nLV3gWMOWz3dmBbfcqVJB2rWq66CWAxsCkz//mwTauAOdXyHGDlYe03VFffXAjsPjjFI0nqe7V8lODFwPXAryNiXdX2D8CXgBURMRf4HTCj2vYDYBqwBXgd+ERdK5YkHZNegz4zf0rP8+4Al/fQP4GbT7AuSVKdeGesJBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMK1NrsASQPIwuHNrqA2C3c3u4K66nVEHxHfjIiXI+LZw9reGxGPRsTm6vU9VXtExN0RsSUi1kfERxpZvCSpd7VM3dwPXHlE2wLgscwcBzxWrQN8FBhXfc0DvlGfMiVJx6vXoM/M/wReOaL5amBJtbwEuOaw9m9lt18AIyLitHoVK0k6dsf7x9j3ZeZ2gOr11Kp9NLD1sH5dVdtRImJeRKyNiLU7duw4zjIkSb2p91U30UNb9tQxMxdl5sTMnNjW1lbnMiRJBx1v0P/+4JRM9fpy1d4FjDmsXzuw7fjLkySdqOMN+lXAnGp5DrDysPYbqqtvLgR2H5zikSQ1R6/X0UfEUmAKMCoiuoDbgC8BKyJiLvA7YEbV/QfANGAL8DrwiQbULEk6Br0GfWbOfotNl/fQN4GbT7QoSVL9+AgESSqcQS9JhTPoJalwPtTsGHQs+H6zS6jJi0ObXYGkdxJH9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFa4hQR8RV0bEcxGxJSIWNOIckqTa1D3oI6IFuBf4KHAOMDsizqn3eSRJtWnEiP4CYEtm/jYz9wHLgKsbcB5JUg1aG3DM0cDWw9a7gMlHdoqIecC8anVPRDzXgFoGpIBRwM5m19Gr26PZFaiP+b1Zd2fW0qkRQd/Tv1Ae1ZC5CFjUgPMPeBGxNjMnNrsO6Uh+bzZHI6ZuuoAxh623A9sacB5JUg0aEfRPAeMiYmxEDAZmAasacB5JUg3qPnWTmfsj4hbgR0AL8M3M3FDv8+htOSWmdyq/N5sgMo+aPpckFcQ7YyWpcAa9JBXOoJekwjXiOnr1oYj4C7rvPB5N9/0K24BVmbmpqYVJesdwRN+PRcTn6X7ERABP0n1pawBLfZicpIO86qYfi4j/AsZn5htHtA8GNmTmuOZUJr29iPhEZv5rs+sYKBzR929vAqf30H5atU16p7q92QUMJM7R92+fBR6LiM386UFyZwBnAbc0rSoJiIj1b7UJeF9f1jLQOXXTz0XESXQ/Gno03f+BuoCnMvNAUwvTgBcRvwemAq8euQl4IjN7+m1UDeCIvp/LzDeBXzS7DqkHq4FhmbnuyA0Rsabvyxm4HNFLUuH8Y6wkFc6gl6TCGfQa8CLiHyNiQ0Ssj4h1EXHUR19K/Zl/jNWAFhEXAVcBH8nMP0bEKGBwk8uS6soRvQa604CdmflHgMzcmZnbImJCRDweEZ0R8aOIOC0iWiPiqYiYAhAR/xQRdzSzeKkWXnWjAS0ihgE/Bf4M+A9gOfAE8DhwdWbuiIiZwNTMvDEixgPfAz4D3AlMzsx9zaleqo1TNxrQMnNPREwALgH+iu6g/yLwl8CjEQHdH4m5veq/ISK+DTwMXGTIqz8w6DXg
VXcRrwHWRMSvgZvpfijcRW+xy7nAa3gbv/oJ5+g1oEXEn0fE4U/5PB/YBLRVf6glIgZVUzZExN8AI4FLgbsjYkRf1ywdK+foNaBV0zb3ACOA/cAWYB7QDtwNDKf7N9+vAv9O9/z95Zm5NSI+A0zIzDnNqF2qlUEvSYVz6kaSCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpML9H7sxQ6xCeXsYAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sex.plot.bar()"
   ]
  },
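  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conclusion: women (Sex = 0) survived at a much higher rate than men (Sex = 1). 233 of 314 women survived (about 74%), against 109 of 577 men (about 19%)."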
   ]
  },
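  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sex crosstab above reports raw counts, while the conclusion is about rates. The cell below is a minimal sketch that converts those counts into per-group proportions. It rebuilds the table by hand from the printed output rather than reading `train2`, and the name `sex_rates` is introduced here only for illustration. It also assumes the encoding 0 = female, 1 = male, which is consistent with the 314 women and 577 men in the Kaggle training set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Counts copied from the crosstab output above (rows: Sex code, columns: Survived).\n",
    "sex_counts = pd.DataFrame(\n",
    "    {0: [81, 468], 1: [233, 109]},\n",
    "    index=pd.Index([0, 1], name='Sex'),\n",
    ")\n",
    "sex_counts.columns.name = 'Survived'\n",
    "\n",
    "# Divide each row by its row total so that every row sums to 1.\n",
    "sex_rates = sex_counts.div(sex_counts.sum(axis=1), axis=0)\n",
    "print(sex_rates.round(3))\n",
    "\n",
    "# On the real data, pd.crosstab(train2.Sex, train2.Survived, normalize='index')\n",
    "# should give the same proportions in one call."
   ]
  },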
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Survived</th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>isalone</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>175</td>\n",
       "      <td>179</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>374</td>\n",
       "      <td>163</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Survived    0    1\n",
       "isalone           \n",
       "0         175  179\n",
       "1         374  163"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Cross-tabulate the isalone flag against survival\n",
    "isalone = pd.crosstab(train2.isalone,train2.Survived)\n",
    "isalone"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x2808ffac588>"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEGCAYAAABrQF4qAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAFVBJREFUeJzt3X+QVeWd5/H314YBS/wRoTVIE5uMpEZchcRGzVq6rMn6g0mhUxsEpkpJNNtO1FpSNTM1OlspsWqtSrLOWPnhWpLVgUxUJHFmII5xxnFi3MT4o9tBIqArCW5oobRBw4S4KuJ3/+iD9mDDvfTt2y1Pv19Vt+69z3mec76Hgk+ffnjOvZGZSJLKddhIFyBJai6DXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klS4MSNdAMCkSZOyvb19pMuQpENKd3f39sxsrdXvAxH07e3tdHV1jXQZknRIiYj/W08/p24kqXAGvSQVzqCXpMJ9IOboJWmo7d69m56eHt54442RLqVh48ePp62tjbFjxw5qvEEvqUg9PT0ceeSRtLe3ExEjXc6gZSY7duygp6eHadOmDWofTt1IKtIbb7zBxIkTD+mQB4gIJk6c2NBvJga9pGId6iG/V6PnYdBLUuGco5cK0H7d3490CXV58Su/P9IlcNNNN3H33XfT0tLCYYcdxu23386ZZ57Z0D7XrFnDhg0buO666xqub8KECezatavh/fRn0EsaNX72s59x//338/TTTzNu3Di2b9/OW2+9VdfYt99+mzFjBo7MefPmMW/evKEsdUg5dSNp1Ni2bRuTJk1i3LhxAEyaNIkTTjiB9vZ2tm/fDkBXVxdz5swBYOnSpXR2dnL++edz+eWXc+aZZ7J+/fp39zdnzhy6u7tZvnw51157LTt37qS9vZ133nkHgNdff52pU6eye/dufvGLX3DhhRdy+umnc8455/Dcc88BsHnzZj75yU8ye/ZsvvzlLzflvA16SaPG+eefz5YtW/jYxz7G1VdfzY9//OOaY7q7u1m9ejV33303CxcuZNWqVUDfD42tW7dy+umnv9v36KOPZubMme/u9wc/+AEXXHABY8eOpbOzk29+85t0d3dz8803c/XVVwOwZMkSvvjFL/LUU0/x4Q9/uAlnbdBLGkUmTJhAd3c3y5Yto7W1lQULFrB8+fIDjpk3bx6HH344AJdeeinf+973AFi1ahXz589/X/8FCxZw7733ArBy5UoWLFjArl27eOyxx5g/fz6zZs3iqquuYtu2bQD89Kc/ZdGiRQBcdtllQ3Wq/4Zz9JJGlZaWFubMmcOcOXM49dRTWbFiBWPGjHl3umXf9epHHHHEu6+nTJnCxIkTWbduHffeey+33377+/Y/b948rr/+el599VW6u7s577zz+O1vf8sxxxzD2rVrB6yp2ctAvaKXNGo8//zzvPDCC+++X7t2LSeeeCLt7e10d3cDcN999x1wHwsXLuRrX/saO3fu5NRTT33f9gkTJnDGGWewZMkSPvOZz9DS0sJRRx3FtGnT3v1tIDN55plnADj77LNZuXIlAHfdddeQnOe+DHpJo8auXbtYvHgxM2bM4LTTTmPDhg0sXbqUG264gSVLlnDOOefQ0tJywH189rOfZeXKlVx66aX77bNgwQK++93vsmDBgnfb7rrrLu644w5mzpzJKaecwurVqwH4+te/zq233srs2bPZuXPn0JzoPiIzm7Ljg9HR0ZF+8Yg0eK6jf7+NGzdy8sknD9vxmm2g84mI7szsqDXWK3pJKlzNoI+I8RHxZEQ8ExHrI+LGqn15RGyOiLXVY1bVHhHxjYjYFBHrIuITzT4JSdL+1bPq5k3gvMzcFRFjgZ9ExA+rbX+amd/fp/9FwPTqcSZwW/UsSRoBNa/os8/eD14YWz0ONLF/MfCdatzjwDERMbnxUiVJg1HXHH1EtETE
WuAV4KHMfKLadFM1PXNLRIyr2qYAW/oN76naJEkjoK6gz8w9mTkLaAPOiIh/B1wP/B4wGzgW+LOq+0Ar/9/3G0BEdEZEV0R09fb2Dqp4SVJtB3VnbGb+OiIeAS7MzJur5jcj4q+AP6ne9wBT+w1rA7YOsK9lwDLoW155kHVLUsOGellqPctHH3zwQZYsWcKePXv4whe+MCQfbVxLPatuWiPimOr14cCngef2zrtH3727lwDPVkPWAJdXq2/OAnZm5ramVC9Jh5A9e/ZwzTXX8MMf/pANGzZwzz33sGHDhqYft54r+snAiohooe8Hw6rMvD8i/jkiWumbqlkL/FHV/wFgLrAJeB34/NCXLUmHnieffJKTTjqJj370o0DfxymsXr2aGTNmNPW4NYM+M9cBHx+g/bz99E/gmsZLk6SyvPTSS0yd+t7MdltbG0888cQBRgwN74yVpGEy0EfODMcXmBv0kjRM2tra2LLlvdXnPT09nHDCCU0/rkEvScNk9uzZvPDCC2zevJm33nqLlStXDst3zfrFI5JGreH8NE2AMWPG8K1vfYsLLriAPXv2cMUVV3DKKac0/7hNP4Ik6V1z585l7ty5w3pMp24kqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4VxeKWn0Wnr0EO9vZ80uV1xxBffffz/HHXcczz77bM3+Q8ErekkaRp/73Od48MEHh/WYBr0kDaNzzz2XY489dliPadBLUuEMekkqnEEvSYUz6CWpcC6vlDR61bEccqgtWrSIRx55hO3bt9PW1saNN97IlVde2dRj1gz6iBgPPAqMq/p/PzNviIhpwErgWOBp4LLMfCsixgHfAU4HdgALMvPFJtUvSYeUe+65Z9iPWc/UzZvAeZk5E5gFXBgRZwFfBW7JzOnAa8DeH0lXAq9l5knALVU/SdIIqRn02WdX9XZs9UjgPOD7VfsK4JLq9cXVe6rtn4rh+PZbSdKA6vrP2IhoiYi1wCvAQ8AvgF9n5ttVlx5gSvV6CrAFoNq+E5g4wD47I6IrIrp6e3sbOwtJGkBmjnQJQ6LR86gr6DNzT2bOAtqAM4CTB+pWPQ909f6+KjNzWWZ2ZGZHa2trvfVKUl3Gjx/Pjh07Dvmwz0x27NjB+PHjB72Pg1p1k5m/johHgLOAYyJiTHXV3gZsrbr1AFOBnogYAxwNvDroCiVpENra2ujp6aGEGYPx48fT1tY26PH1rLppBXZXIX848Gn6/oP1R8Bn6Vt5sxhYXQ1ZU73/WbX9n/NQ/5Eq6ZAzduxYpk2bNtJlfCDUc0U/GVgRES30TfWsysz7I2IDsDIi/jvwL8AdVf87gL+OiE30XckvbELdkqQ61Qz6zFwHfHyA9l/SN1+/b/sbwPwhqU6S1DA/AkGSCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqXM2gj4ipEfGjiNgYEesjYknVvjQiXoqItdVjbr8x10fEpoh4PiIuaOYJSJIOrOaXgwNvA3+cmU9HxJFAd0Q8VG27JTNv7t85ImYAC4FTgBOAf4qIj2XmnqEsXJJUn5pX9Jm5LTOfrl7/BtgITDnAkIuBlZn5ZmZuBjYBZwxFsZKkg3dQc/QR0Q58HHiiaro2ItZFxJ0R8aGqbQqwpd+wHgb4wRARnRHRFRFdvb29B124JKk+dQd9REwA7gO+lJn/CtwG/C4wC9gG/MXergMMz/c1ZC7LzI7M7GhtbT3owiVJ9akr6CNiLH0hf1dm/g1AZr6cmXsy8x3g27w3PdMDTO03vA3YOnQlS5IORj2rbgK4A9iYmX/Zr31yv25/ADxbvV4DLIyIcRExDZgOPDl0JUuSDkY9q27OBi4Dfh4Ra6u2PwcWRcQs+qZlXgSuAsjM9RGxCthA34qda1xxI0kjp2bQZ+ZPGHje/YEDjLkJuKmB
uiRJQ8Q7YyWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFa6eLwefGhE/ioiNEbE+IpZU7cdGxEMR8UL1/KGqPSLiGxGxKSLWRcQnmn0SkqT9q+eK/m3gjzPzZOAs4JqImAFcBzycmdOBh6v3ABcB06tHJ3DbkFctSapbzaDPzG2Z+XT1+jfARmAKcDGwouq2Ariken0x8J3s8zhwTERMHvLKJUl1Oag5+ohoBz4OPAEcn5nboO+HAXBc1W0KsKXfsJ6qTZI0AuoO+oiYANwHfCkz//VAXQdoywH21xkRXRHR1dvbW28ZkqSDVFfQR8RY+kL+rsz8m6r55b1TMtXzK1V7DzC13/A2YOu++8zMZZnZkZkdra2tg61fklRDPatuArgD2JiZf9lv0xpgcfV6MbC6X/vl1eqbs4Cde6d4JEnDb0wdfc4GLgN+HhFrq7Y/B74CrIqIK4FfAfOrbQ8Ac4FNwOvA54e0YknSQakZ9Jn5Ewaedwf41AD9E7imwbokSUPEO2MlqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4epZR69DzdKjR7qC+izdOdIVSKOCV/SSVDiDXpIK59TNQWi/7u9HuoS6vDh+pCuQ9EHiFb0kFc6gl6TCGfSSVDiDXpIKZ9BLUuFcdSNp+Hgz34jwil6SCmfQS1LhDHpJKlzNoI+IOyPilYh4tl/b0oh4KSLWVo+5/bZdHxGbIuL5iLigWYVLkupTzxX9cuDCAdpvycxZ1eMBgIiYASwETqnG/M+IaBmqYiVJB69m0Gfmo8Crde7vYmBlZr6ZmZuBTcAZDdQnSWpQI3P010bEumpq50NV2xRgS78+PVXb+0REZ0R0RURXb29vA2VIkg5ksEF/G/C7wCxgG/AXVXsM0DcH2kFmLsvMjszsaG1tHWQZkqRaBhX0mflyZu7JzHeAb/Pe9EwPMLVf1zZga2MlSpIaMaigj4jJ/d7+AbB3Rc4aYGFEjIuIacB04MnGSpQkNaLmRyBExD3AHGBSRPQANwBzImIWfdMyLwJXAWTm+ohYBWwA3gauycw9zSldklSPmkGfmYsGaL7jAP1vAm5qpChJ0tDxzlhJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcDWDPiLujIhXIuLZfm3HRsRDEfFC9fyhqj0i4hsRsSki1kXEJ5pZvCSptnqu6JcDF+7Tdh3wcGZOBx6u3gNcBEyvHp3AbUNTpiRpsGoGfWY+Cry6T/PFwIrq9Qrgkn7t38k+jwPHRMTkoSpWknTwBjtHf3xmbgOono+r2qcAW/r166na3iciOiOiKyK6ent7B1mGJKmWof7P2BigLQfqmJnLMrMjMztaW1uHuAxJ0l6DDfqX907JVM+vVO09wNR+/dqArYMvT5LUqMEG/RpgcfV6MbC6X/vl1eqbs4Cde6d4JEkjY0ytDhFxDzAHmBQRPcANwFeAVRFxJfArYH7V/QFgLrAJeB34fBNqliQdhJpBn5mL9rPpUwP0TeCaRouSJA0d74yVpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuEMekkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klS4ml8leCAR8SLwG2AP8HZmdkTEscC9QDvwInBpZr7WWJmSpMEaiiv6/5iZszKzo3p/HfBwZk4HHq7eS5JGSDOmbi4GVlSvVwCXNOEYkqQ6NRr0CfxjRHRHRGfVdnxmbgOono9r8BiSpAY0NEcPnJ2ZWyPiOOChiHiu3oHVD4ZOgI985CMNliFJ2p+Grugzc2v1/Arwt8AZwMsRMRmgen5lP2OXZWZH
Zna0trY2UoYk6QAGHfQRcUREHLn3NXA+8CywBlhcdVsMrG60SEnS4DUydXM88LcRsXc/d2fmgxHxFLAqIq4EfgXMb7xMSdJgDTroM/OXwMwB2ncAn2qkKEnS0PHOWEkqnEEvSYUz6CWpcAa9JBXOoJekwhn0klQ4g16SCmfQS1LhDHpJKpxBL0mFM+glqXAGvSQVzqCXpMIZ9JJUOINekgpn0EtS4Qx6SSqcQS9JhTPoJalwTQv6iLgwIp6PiE0RcV2zjiNJOrCmBH1EtAC3AhcBM4BFETGjGceSJB1Ys67ozwA2ZeYvM/MtYCVwcZOOJUk6gDFN2u8UYEu/9z3Amf07REQn0Fm93RURzzepllEnYBKwfaTrqOnGGOkKNMz8uznkTqynU7OCfqA/pfw3bzKXAcuadPxRLSK6MrNjpOuQ9uXfzZHRrKmbHmBqv/dtwNYmHUuSdADNCvqngOkRMS0ifgdYCKxp0rEkSQfQlKmbzHw7Iq4F/gFoAe7MzPXNOJYG5JSYPqj8uzkCIjNr95IkHbK8M1aSCmfQS1LhDHpJKlyz1tFrGEXE79F35/EU+u5X2AqsycyNI1qYpA8Er+gPcRHxZ/R9xEQAT9K3tDWAe/wwOUngqptDXkT8H+CUzNy9T/vvAOszc/rIVCbtX0R8PjP/aqTrGC28oj/0vQOcMED75Gqb9EF040gXMJo4R3/o+xLwcES8wHsfJPcR4CTg2hGrSqNeRKzb3ybg+OGsZbRz6qYAEXEYfR8NPYW+f0Q9wFOZuWdEC9OoFhEvAxcAr+27CXgsMwf6TVRN4BV9ATLzHeDxka5D2sf9wITMXLvvhoh4ZPjLGb28opekwvmfsZJUOINekgpn0KtoEfHYIMc9EhF+E5KKYNCraJn570e6BmmkGfQqWkTsqp4nR8SjEbE2Ip6NiHOq9tsioisi1kfEgDfxRMSiiPh5Ne6r/fcdETdFxDMR8XhEHF+1t0bEfRHxVPU4ezjOVdofg16jxR8C/5CZs4CZwN4lf/+t+rLq04D/EBGn9R8UEScAXwXOA2YBsyPikmrzEcDjmTkTeBT4L1X714FbMnM28J+B/9W805Jqcx29RoungDsjYizwd/3Wdl8aEZ30/VuYDMwA+t/RORt4JDN7ASLiLuBc4O+At+hbKw7QDfyn6vWngRkRsXcfR0XEkZn5m6acmVSDQa9RITMfjYhzgd8H/joi/gfwv4E/AWZn5msRsRwYv8/QYP9253s3ouzhvX9PhwGfzMz/N2QnIDXAqRuNChFxIvBKZn4buAP4BHAU8FtgZzW/ftEAQ5+gb0pnUkS0AIuAH9c43D/S73OGImLWEJyCNGhe0Wu0mAP8aUTsBnYBl2fm5oj4F2A98Evgp/sOysxtEXE98CP6ru4fyMzVNY71X4Fbqw/1GkPf/P0fDdmZSAfJj0CQpMI5dSNJhTPoJalwBr0kFc6gl6TCGfSSVDiDXpIKZ9BLUuH+P18frvZ6J0PrAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "isalone.plot.bar()"
   ]
  },
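  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conclusion: passengers travelling alone had a lower survival rate (163 of 537, about 30%) than those with family aboard (179 of 354, about 51%)."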
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Survived</th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>familysize</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>374</td>\n",
       "      <td>163</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>72</td>\n",
       "      <td>89</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>43</td>\n",
       "      <td>59</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>8</td>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>12</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>19</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>8</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Survived      0    1\n",
       "familysize          \n",
       "1           374  163\n",
       "2            72   89\n",
       "3            43   59\n",
       "4             8   21\n",
       "5            12    3\n",
       "6            19    3\n",
       "7             8    4\n",
       "8             6    0\n",
       "11            7    0"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "familysize = pd.crosstab(train2.familysize,train2.Survived)\n",
    "familysize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x2809001af60>"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAENCAYAAAABh67pAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAGUJJREFUeJzt3X+UVOWd5/H3Jw0DRgwqtAZoYpMRHUGllQZ1XT2MZgRJDpo5IpA5gj827UTYkJPZ3dHMyUpmlhmTdeLmh+OGCQpu1BY1DmgcI6PRrNGo3QZRRCKKKy1EW4xEdPwBfPePum0qWHQVXVVd3Q+f1zl1quq5z733W/z49O2nnnuvIgIzM0vXx2pdgJmZVZeD3swscQ56M7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBLnoDczS5yD3swscQNqXQDA8OHDo7GxsdZlmJn1K+3t7a9HRH2xfn0i6BsbG2lra6t1GWZm/Yqk/1dKPw/dmJklzkFvZpY4B72ZWeL6xBi9mVmlffDBB3R0dPDuu+/WupSyDR48mIaGBgYOHNij9R30Zpakjo4ODjroIBobG5FU63J6LCLYtm0bHR0djBkzpkfb8NCNmSXp3XffZdiwYf065AEkMWzYsLJ+M3HQm1my+nvIdyn3czjozWy/snjxYsaPH8/xxx9PU1MTjz32WNnbXLVqFVdddVUFqoMhQ4ZUZDv5+tUYfePlPyna56WrPtsLlZhZf/Too49y99138+STTzJo0CBef/113n///ZLW3blzJwMGFI7MGTNmMGPGjEqWWlE+ojez/cbWrVsZPnw4gwYNAmD48OGMHDmSxsZGXn/9dQDa2tqYMmUKAIsWLaKlpYWzzjqLuXPnctJJJ7Fu3boPtzdlyhTa29tZtmwZCxYsYPv27TQ2NrJ7924A3nnnHUaPHs0HH3zACy+8wLRp05g4cSKnnXYazz33HACbNm3ilFNOYdKkSXz961+vyud20JvZfuOss85i8+bNHHXUUVx22WU89NBDRddpb29n5cqV3HzzzcyePZsVK1YAuR8aW7ZsYeLEiR/2HTp0KBMmTPhwu3fddRdTp05l4MCBtLS08L3vfY/29nauvvpqLrvsMgAWLlzIl770JZ544gk++clPVuFTO+jNbD8yZMgQ2tvbWbJkCfX19cyaNYtly5Z1u86MGTM44IADADj//PO57bbbAFixYgUzZ878SP9Zs2Zx6623AtDa2sqsWbPYsWMHjzzyCDNnzqSpqYlLL72UrVu3AvCLX/yCOXPmAHDBBRdU6qP+gX41Rm9mVq66ujqmTJnClClTOO6441i+fDkDBgz4cLhlz2mMBx544IevR40axbBhw1i7di233norP/jBDz6y/RkzZnDFFVfwxhtv0N7ezhlnnMHbb7/NwQcfzJo1awrWVO3ZQT6iN7P9xoYNG3j++ec/fL9mzRqOOOIIGhsbaW9vB+COO+7odhuzZ8/mW9/6Ftu3b+e44477yPIhQ4YwefJkFi5cyOc+9znq6ur4xCc+wZgxYz78bSAieOqppwA49dRTaW1tBeCmm26qyOfck4PezPYbO3bsYN68eYwbN47jjz+eZ599lkWLFnHllVeycOFCTjvtNOrq6rrdxnnnnUdrayvnn3/+XvvMmjWLH/3oR8yaNevDtptuuomlS5cyYcIExo8fz8qVKwH4zne+w7XXXsukSZPYvn17ZT7oHhQRVdnwvmhubo5Srkfv6ZVmVqr169dzzDHH1LqMiin0eSS1R0RzsXV9RG9mlriiQS9psKTHJT0laZ2kb2TtyyRtkrQmezRl7ZL0XUkbJa2VdGK1P4SZme1dKbNu3gPOiIgdkgYCD0v612zZf42I2/fofzYwNnucBFyXPZuZWQ0UPaKPnB3Z24HZo7uB/XOAG7P1fgkcLGlE+aWamVlPlDRGL6lO0hrgNWB1RHRdBWhxNjxz
jaRBWdsoYHPe6h1Zm5mZ1UBJQR8RuyKiCWgAJks6FrgC+BNgEnAo8NdZ90Iz/z/yG4CkFkltkto6Ozt7VLyZmRW3T7NuIuJN4EFgWkRszYZn3gNuACZn3TqA0XmrNQBbCmxrSUQ0R0RzfX19j4o3M+tv7r33Xo4++miOPPLIil3auJiiX8ZKqgc+iIg3JR0AfAb4pqQREbFVuXN3zwWeyVZZBSyQ1EruS9jtEbG1SvWbmfVYKefm7Iti5/Hs2rWL+fPns3r1ahoaGpg0aRIzZsxg3LhxFa1jT6XMuhkBLJdUR+43gBURcbekB7IfAgLWAH+Z9b8HmA5sBN4BLqp82WZm/c/jjz/OkUceyac//WkgdzmFlStX1j7oI2ItcEKB9jP20j+A+eWXZmaWlldeeYXRo38/st3Q0FCRO1wV4zNjzcx6SaFLzvTGfW0d9GZmvaShoYHNm38/+7yjo4ORI0dWfb8OejOzXjJp0iSef/55Nm3axPvvv09ra2uv3GvWNx4xM+slAwYM4Pvf/z5Tp05l165dXHzxxYwfP776+636HszM+qhaXNZ8+vTpTJ8+vVf36aEbM7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBLnoDcz60UXX3wxhx12GMcee2yv7dPz6M1s/7VoaIW3t71olwsvvJAFCxYwd+7cyu67Gz6iNzPrRaeffjqHHnpor+7TQW9mljgHvZlZ4hz0ZmaJc9CbmSXOQW9m1ovmzJnDKaecwoYNG2hoaGDp0qVV32fR6ZWSBgM/BwZl/W+PiCsljQFagUOBJ4ELIuJ9SYOAG4GJwDZgVkS8VKX6zcx6roTpkJV2yy239Po+Szmifw84IyImAE3ANEknA98EromIscBvgUuy/pcAv42II4Frsn5mZlYjRYM+cnZkbwdmjwDOAG7P2pcD52avz8neky0/U71x91szMyuopDF6SXWS1gCvAauBF4A3I2Jn1qUDGJW9HgVsBsiWbweGVbJoMzMrXUlBHxG7IqIJaAAmA8cU6pY9Fzp6jz0bJLVIapPU1tnZWWq9ZmYli/hI9PRL5X6OfZp1ExFvAg8CJwMHS+r6MrcB2JK97gBGA2TLhwJvFNjWkohojojm+vr6nlVvZrYXgwcPZtu2bf0+7COCbdu2MXjw4B5vo5RZN/XABxHxpqQDgM+Q+4L1Z8B55GbezANWZqusyt4/mi1/IPr7n7SZ9TsNDQ10dHSQwojB4MGDaWho6PH6pVy9cgSwXFIdud8AVkTE3ZKeBVol/Q/gV0DXZNClwP+RtJHckfzsHldnZtZDAwcOZMyYMbUuo08oGvQRsRY4oUD7i+TG6/dsfxeYWZHqzMysbD4z1swscQ56M7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBLnoDczS5yD3swscQ56M7PEOejNzBJXNOgljZb0M0nrJa2TtDBrXyTpFUlrssf0vHWukLRR0gZJU6v5AczMrHtFbw4O7AT+KiKelHQQ0C5pdbbsmoi4Or+zpHHAbGA8MBL4N0lHRcSuShZuZmalKXpEHxFbI+LJ7PVbwHpgVDernAO0RsR7EbEJ2AhMrkSxZma27/ZpjF5SI3AC8FjWtEDSWknXSzokaxsFbM5brYPufzCYmVkVlRz0koYAdwBfiYjfAdcBfww0AVuBf+zqWmD1KLC9Fkltkto6Ozv3uXAzMytNSUEvaSC5kL8pIn4MEBGvRsSuiNgN/DO/H57pAEbnrd4AbNlzmxGxJCKaI6K5vr6+nM9gZmbdKGXWjYClwPqI+HZe+4i8bp8HnslerwJmSxokaQwwFni8ciWbmdm+KGXWzanABcDTktZkbV8D5khqIjcs8xJwKUBErJO0AniW3Iyd+Z5xY2ZWO0WDPiIepvC4+z3drLMYWFxGXWZmViE+M9bMLHEOejOzxDnozcwS56A3
M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxJVyc/DRkn4mab2kdZIWZu2HSlot6fns+ZCsXZK+K2mjpLWSTqz2hzAzs70r5Yh+J/BXEXEMcDIwX9I44HLg/ogYC9yfvQc4GxibPVqA6ypetZmZlaxo0EfE1oh4Mnv9FrAeGAWcAyzPui0Hzs1enwPcGDm/BA6WNKLilZuZWUn2aYxeUiNwAvAYcHhEbIXcDwPgsKzbKGBz3modWZuZmdVAyUEvaQhwB/CViPhdd10LtEWB7bVIapPU1tnZWWoZZma2j0oKekkDyYX8TRHx46z51a4hmez5tay9Axidt3oDsGXPbUbEkohojojm+vr6ntZvZmZFlDLrRsBSYH1EfDtv0SpgXvZ6HrAyr31uNvvmZGB71xCPmZn1vgEl9DkVuAB4WtKarO1rwFXACkmXAC8DM7Nl9wDTgY3AO8BFFa3YzMz2SdGgj4iHKTzuDnBmgf4BzC+zLjMzqxCfGWtmljgHvZlZ4hz0ZmaJc9CbmSXOQW9mljgHvZlZ4hz0ZmaJc9CbmSXOQW9mljgHvZlZ4hz0ZmaJc9CbmSXOQW9mljgHvZlZ4hz0ZmaJc9CbmSWulDtM9S+LhpbQZ3v16zAz6yN8RG9mljgHvZlZ4ooGvaTrJb0m6Zm8tkWSXpG0JntMz1t2haSNkjZImlqtws3MrDSlHNEvA6YVaL8mIpqyxz0AksYBs4Hx2Tr/JKmuUsWamdm+Kxr0EfFz4I0St3cO0BoR70XEJmAjMLmM+szMrEzljNEvkLQ2G9o5JGsbBWzO69ORtZmZWY30NOivA/4YaAK2Av+YtatA3yi0AUktktoktXV2dvawDDMzK6ZHQR8Rr0bErojYDfwzvx+e6QBG53VtALbsZRtLIqI5Iprr6+t7UoaZmZWgR0EvaUTe288DXTNyVgGzJQ2SNAYYCzxeXolmZlaOomfGSroFmAIMl9QBXAlMkdREbljmJeBSgIhYJ2kF8CywE5gfEbuqU7qZmZWiaNBHxJwCzUu76b8YWFxOUWZmVjk+M9bMLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwSVzToJV0v6TVJz+S1HSpptaTns+dDsnZJ+q6kjZLWSjqxmsWbmVlxpRzRLwOm7dF2OXB/RIwF7s/eA5wNjM0eLcB1lSnTzMx6qmjQR8TPgTf2aD4HWJ69Xg6cm9d+Y+T8EjhY0ohKFWtmZvuup2P0h0fEVoDs+bCsfRSwOa9fR9ZmZmY1UukvY1WgLQp2lFoktUlq6+zsrHAZZmbWpadB/2rXkEz2/FrW3gGMzuvXAGwptIGIWBIRzRHRXF9f38MyzMysmJ4G/SpgXvZ6HrAyr31uNvvmZGB71xCPmZnVxoBiHSTdAkwBhkvqAK4ErgJWSLoEeBmYmXW/B5gObATeAS6qQs1mZrYPigZ9RMzZy6IzC/QNYH65RZmZWeX4zFgzs8QVPaK3Clg0tIQ+26tfh5ntl3xEb2aWOAe9mVniHPRmZolz0JuZJc5Bb2aWOAe9mVniHPRmZolz0JuZJc4nTJWp8fKfFO3z0uBeKMTMbC98RG9mljgHvZlZ4hz0ZmaJc9CbmSXOQW9mljgHvZlZ4jy9cn9VyjXywdfJN0tAWUEv6SXgLWAXsDMimiUdCtwKNAIvAedHxG/LK9PMzHqqEkM3fxoRTRHRnL2/HLg/IsYC92fvzcysRqoxRn8OsDx7vRw4twr7MDOzEpUb9AHcJ6ldUkvWdnhEbAXIng8rcx9mZlaGcr+MPTUitkg6DFgt6blSV8x+MLQAfOpTnyqzDDMz25uyjugjYkv2/BpwJzAZeFXSCIDs
+bW9rLskIpojorm+vr6cMszMrBs9DnpJB0o6qOs1cBbwDLAKmJd1mwesLLdIMzPruXKGbg4H7pTUtZ2bI+JeSU8AKyRdArwMzCy/TDMz66keB31EvAhMKNC+DTiznKKsPL5Gvpnl8yUQzMwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxDnozcwS56A3M0ucg97MLHG+Obj1Hb5huVlV+IjezCxxPqI362NKuvroVZ/thUosFT6iNzNLnIPezCxxHrqx/VYpQyTgYRLr/3xEb2aWOAe9mVniqjZ0I2ka8B2gDvhhRFxVrX1Z3+f72PZvnglUmr7651SVoJdUB1wL/BnQATwhaVVEPFuN/ZnZ/qevhmpfVK0j+snAxoh4EUBSK3AO4KC3/qeUM3Z7+2xd12T7oFpj9KOAzXnvO7I2MzPrZYqIym9UmglMjYj/lL2/AJgcEf85r08L0JK9PRrYUKHdDwder9C2KsU1laYv1gR9sy7XVJrUazoiIuqLdarW0E0HMDrvfQOwJb9DRCwBllR6x5LaIqK50tsth2sqTV+sCfpmXa6pNK4pp1pDN08AYyWNkfRHwGxgVZX2ZWZm3ajKEX1E7JS0APgpuemV10fEumrsy8zMule1efQRcQ9wT7W2342KDwdVgGsqTV+sCfpmXa6pNK6JKn0Za2ZmfYcvgWBmljgHvZlZ4hz0+wlJkyVNyl6Pk/RVSdNrXVc+STfWugbrvyT9kaS5kj6Tvf+CpO9Lmi9pYK3rqyWP0VeBpD8hdybwYxGxI699WkTcW4N6rgTOJvfl+2rgJOBB4DPATyNicQ1q2nO6rYA/BR4AiIgZvV3TniT9R3KX83gmIu6rYR0nAesj4neSDgAuB04kd0mRv4+IXr+ugKQvA3dGxOainXuJpJvI/Rv/OPAmMAT4MXAmuaybV8PyairZoJd0UUTcUIP9fhmYD6wHmoCFEbEyW/ZkRJxYg5qezmoZBPwGaMgLjcci4vga1PQkuaD6IRDkgv4WcudcEBEP1aCmxyNicvb6i+T+Hu8EzgLuqtUVWCWtAyZk05aXAO8At5MLsAkR8ec1qGk78DbwArm/t9siorO369ijprURcbykAcArwMiI2CVJwFO1+HfeV6Q8dPONGu33i8DEiDgXmAJ8XdLCbJlqVNPOiNgVEe8AL0TE7wAi4t+B3TWqqRloB/4G2B4RDwL/HhEP1SLkM/m/3rcAfxYR3yAX9H9Rm5IA+FhE7MxeN0fEVyLi4ay2T9eophfJnfH+d8BE4FlJ90qaJ+mgGtX0sewEzYPIHdV3XWVtEH/4d9tnSPrX3thPv76VoKS1e1sEHN6bteSp6xquiYiXJE0Bbpd0BLUL+vclfTwL+oldjZKGUqOgj4jdwDWSbsueX6X2/x4/JukQcgdA6jpCjYi3Je3sftWqeibvN9SnJDVHRJuko4APalRTZH+H9wH3ZWPgZwNzgKuBotdfqYKlwHPkTtL8G+A2SS8CJwOtNagHAEl7+y1e5H7Trn4N/XnoJguHqcBv91wEPBIRI2tQ0wPAVyNiTV7bAOB64C8ioq4GNQ2KiPcKtA8HRkTE071dU4FaPgucGhFfq2ENL5H7wSdyw0n/ISJ+I2kI8HBE9Mp/ygJ1DSV3E5/TyF0M60RyV4fdDHw5Ip6qQU2/iogT9rLsgOy3xV4naSRARGyRdDC576FejojHa1FPVtMu4CEKH+idHBEHVL2Gfh70S4EbIuLhAstujogv1KCmBnJDJb8psOzUiPhFb9dk5ZH0ceDwiNhU4zoOIjdUMwDoiIhXa1jLURHx61rtvz+R9Azw+Yh4vsCyzRExusBqla2hPwe9mVlfJ+k84OmI+Mil2CWdGxH/Uu0aaj0mamaWtIi4vZvFh/RGDT6iNzOrEUkvR8Snqr0fH9GbmVVRX5gd6KA3M6uuw+lmdmBvFOCgNzOrrruBIflTrrtIerA3CvAYvZlZ4lK+BIKZmeGgNzNL
noPe+j1JX5a0PrtMbTnb+du8a5k/KKm5B9v4oaRx5dRhVmkeo7d+T9JzwNmVvERB9iXZf4mItkpt06xWfERv/Zqk/03u+i+rJP21pEck/Sp7Pjrrc6Gkf5F0l6RNkhZkd9j6laRfSjo067csO109f/uXSLom7/0XJX1b0oGSfiLpKUnPSJqVLX9QUrOkGZLWZI8NkjZlyydKekhSu6SfShrRW39Wtv9y0Fu/FhF/CWwhd3eq64DTs6sq/nfg7/O6Hgt8gdwdoxYD72T9HgXmdrOLVmBG3q3oLgJuAKYBWyJiQkQcC/zBncMiYlVENGVXvHwKuDrbxveA8yJiIrkrmvb63b1s/+N59JaSocBySWPJXWY4/2YTP4uIt4C3srsj3ZW1Pw3s9c5D2bXoHwA+J2k9MDAinpb0Hrnw/iZwd0T830LrS/pv5G6ocq2kY8n9wFmdu+kRdcDWcj6wWSkc9JaSvyMX6J+X1Ejuvrhd8q/Hvzvv/W6K/z/4IfA1cje1uAEgIn4taSIwHfgHSfdFxN/mryTpTGAmcHpXE7AuIk7Zt49lVh4HvaVkKLl7hQJcWKmNRsRjkkaTu+HH8fDhDS7eiIgfSdqx5/6yO4r9EzAt7yYcG4B6SadExKPZUM5REbGuUrWaFeKgt5R8i9zQzVeBByq87RVAU0R0Xa/kOOB/StpN7nZ+X9qj/4XAMODObJhmS0RMz77s/W5216gBwP8CHPRWVZ5eaVYCSXcD10TE/bWuxWxfedaNWTckHSzp1+S+UHXIW7/kI3ozs8T5iN7MLHEOejOzxDnozcwS56A3M0ucg97MLHEOejOzxP1/XOWkBuKjJkYAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "familysize.plot.bar()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conclusion: passengers travelling alone and those in families of more than 5 people both had lower survival rates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Survived</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>(1, 10]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>(70, 80]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>(1, 10]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>(50, 60]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>(1, 10]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Survived      Fare\n",
       "0         0   (1, 10]\n",
       "1         1  (70, 80]\n",
       "2         1   (1, 10]\n",
       "3         1  (50, 60]\n",
       "4         0   (1, 10]"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
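   ],
   "source": [
    "# Survival by fare range.\n",
    "# A crosstab on the raw Fare column has too many distinct values to compare,\n",
    "# so the fares are grouped into bins first:\n",
    "# fare = pd.crosstab(train2.Fare, train2.Survived, margins=True)\n",
    "# fare.plot.bar()\n",
    "# Note: the intervals are right-closed, so Fare == 0 falls outside (0, 1] and becomes NaN.\n",
    "bins = [0, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 500, 600]\n",
    "fare = pd.cut(train2.Fare, bins)\n",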
    "\n",
    "temp = pd.DataFrame()\n",
    "temp['Survived'] = train2.Survived\n",
    "temp['Fare'] = fare\n",
    "temp.head()\n",
    "    \n",
    "# temp.loc[train2.Fare > 500]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x28091ab4390>"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAE0CAYAAAAi8viMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3X28VVW97/HPN0AxUVHAfNjoxtQSfKAE1GsW6U3NCsurgj2IWeFJLbun17mZt66e08uu9erJyjrRsQueVLSsoDTTY8+lAVsRUSQ0TLeSIhU+RQH+7h9zblhs9xN7jTX3XoPv+/Var732WHON35x7jf1bc4015hiKCMzMLF8vG+gdMDOzxnKiNzPLnBO9mVnmnOjNzDLnRG9mljknejOzzDnRm5llzonezCxzTvRmZplzojczy9zQgd4BgNGjR0dra+tA74aZWVNpa2t7OiLG9LbdoEj0ra2tLF68eKB3w8ysqUj6Y1+2c9eNmVnmnOjNzDLnRG9mlrlB0UdvZpbahg0baG9vZ/369QO9K3UbPnw4LS0tDBs2rF/Pd6I3syy1t7ezyy670NraiqSB3p1+iwjWrl1Le3s748aN61cd7roxsyytX7+eUaNGNXWSB5DEqFGj6vpk4kRvZtlq9iTfod7jcKI3s+3K5ZdfzoQJEzj88MOZOHEiv/vd7+quc8GCBVxxxRUJ9g5GjBiRpJ5avfbRSxoLXAPsBbwIzI6IKyVdBnwAWFNueklE3FI+5+PA+4BNwIcj4if93cHWi2/usvyRK97S3yrNbDt155138qMf/Yi7776bHXfckaeffpp//OMffXruxo0bGTq065Q5bdo0pk2blnJXk+rLGf1G4KMRcQhwNHCBpPHlY1+MiInlrSPJjwdmABOAk4GvSRrSgH03M9smq1evZvTo0ey4444AjB49mn322YfW1laefvppABYvXszUqVMBuOyyy5g1axYnnngiZ599NkcddRT333//5vqmTp1KW1sbc+bM4cILL2TdunW0trby4osvAvDCCy8wduxYNmzYwMMPP8zJJ5/MkUceyXHHHceDDz4IwKpVqzjmmGOYPHkyn/zkJxty3L0m+ohYHRF3l/efBZYD+/bwlFOBeRHx94hYBTwETEmxs2Zm9TjxxBN57LHHOPjggzn//PP5xS9+0etz2tramD9/Ptdddx0zZszgxhtvBIo3jSeeeIIjjzxy87a77bYbRxxxxOZ6f/jDH3LSSScxbNgwZs2axVe+8hXa2tr43Oc+x/nnnw/ARRddxAc/+EEWLVrEXnvt1YCj3sY+ekmtwGuAjk6tCyUtlfQtSbuXZfsCj9U8rZ2e3xjMzCoxYsQI2tramD17NmPGjGH69OnMmTOnx+dMmzaNnXbaCYAzzzyT73znOwDceOONnHHGGS/Zfvr06dxwww0AzJs3j+nTp/Pcc8/x29/+ljPOOIOJEydy3nnnsXr1agB+85vfcNZZZwHwnve8J9WhbqXP4+gljQBuAj4SEc9I+jrwKSDKn58HzgW6+no4uqhvFjALYL/99tv2PTcz64chQ4YwdepUpk6dymGHHcbcuXMZOnTo5u6WzsMYd95558339913X0aNGsXSpUu54YYb+MY3vvGS+qdNm8bHP/5x/vznP9PW1sbxxx/P888/z8iRI1myZEmX+9To0UF9OqOXNIwiyV8bEd8DiIgnI2JTRLwIfJMt3TPtwNiap7cAT3SuMyJmR8SkiJg0Zkyvs2yamdVtxYoVrFy5cvPvS5YsYf/996e1tZW2tjYAbrrpph7rmDFjBp/97GdZt24dhx122EseHzFiBFOmTOGiiy7irW99K0OGDGHXXXdl3Lhxmz8NRAT33nsvAMceeyzz5s0D4Nprr01ynJ31muhVvNVcDSyPiC/UlO9ds9k7gGXl/QXADEk7ShoHHAQsTLfLZmb989xzzzFz5kzGjx/P4Ycf
zgMPPMBll13GpZdeykUXXcRxxx3HkCE9jx05/fTTmTdvHmeeeWa320yfPp1vf/vbTJ8+fXPZtddey9VXX80RRxzBhAkTmD9/PgBXXnklV111FZMnT2bdunVpDrQTRbykV2XrDaTXAb8C7qMYXglwCXAWMJGiW+YR4LyIWF0+539TdONspOjq+XFPMSZNmhTdzUfv4ZVm1h/Lly/nkEMOGejdSKar45HUFhGTentur330EfFruu53v6WH51wOXN5b3WZm1ni+MtbMLHNO9GZmmXOiNzPLnBO9mVnmnOjNzDLnRG9mVqFbb72VV73qVRx44IHJpjbujZcSNLPtVnfX6fRXb9f3bNq0iQsuuIDbb7+dlpYWJk+ezLRp0xg/fnyPz6uXz+jNzCqycOFCDjzwQA444AB22GEHZsyYsfkK2UZyojczq8jjjz/O2LFbpgJraWnh8ccfb3hcJ3ozs4p0NeVMFevaOtGbmVWkpaWFxx7bslxHe3s7++yzT8PjOtGbmVVk8uTJrFy5klWrVvGPf/yDefPmVbLWrEfdmJlVZOjQoXz1q1/lpJNOYtOmTZx77rlMmDCh8XEbHsHMbJAaiOnOTznlFE455ZRKY7rrxswsc070ZmaZc6I3M8ucE72ZWeac6M3MMudEb2aWOSd6M7MKnXvuuey5554ceuihlcX0OHoz235dtlvi+tb1usk555zDhRdeyNlnn502dg98Rm9mVqHXv/717LHHHpXGdKI3M8ucE72ZWeac6M3MMudEb2aWOSd6M7MKnXXWWRxzzDGsWLGClpYWrr766obH9PBKM9t+9WE4ZGrXX3995TF9Rm9mljknejOzzDnRm5llrtdEL2mspJ9JWi7pfkkXleV7SLpd0sry5+5luSR9WdJDkpZKem2jD8LMrCsRMdC7kES9x9GXM/qNwEcj4hDgaOACSeOBi4E7IuIg4I7yd4A3AweVt1nA1+vaQzOzfhg+fDhr165t+mQfEaxdu5bhw4f3u45eR91ExGpgdXn/WUnLgX2BU4Gp5WZzgZ8DHyvLr4nir3uXpJGS9i7rMTOrREtLC+3t7axZs2agd6Vuw4cPp6Wlpd/P36bhlZJagdcAvwNe0ZG8I2K1pD3LzfYFHqt5WntZ5kRvZpUZNmwY48aNG+jdGBT6/GWspBHATcBHIuKZnjbtouwln50kzZK0WNLiHN5xzcwGqz4leknDKJL8tRHxvbL4SUl7l4/vDTxVlrcDY2ue3gI80bnOiJgdEZMiYtKYMWP6u/9mZtaLvoy6EXA1sDwivlDz0AJgZnl/JjC/pvzscvTN0cA698+bmQ2cvvTRHwu8B7hP0pKy7BLgCuBGSe8DHgXOKB+7BTgFeAh4AXhv0j02M7Nt0pdRN7+m6353gBO62D6AC+rcLzMzS8RXxpqZZc6J3swsc070ZmaZc6I3M8ucE72ZWeac6M3MMudEb2aWOSd6M7PMOdGbmWXOid7MLHNO9GZmmXOiNzPLnBO9mVnmnOjNzDLnRG9mljknejOzzDnRm5llzonezCxzTvRmZplzojczy5wTvZlZ5pzozcwy50RvZpY5J3ozs8w50ZuZZc6J3swsc070ZmaZc6I3M8ucE72ZWeac6M3MMudEb2aWuV4TvaRvSXpK0rKassskPS5pSXk7peaxj0t6SNIKSSc1asfNzKxv+nJGPwc4uYvyL0bExPJ2C4Ck8cAMYEL5nK9JGpJqZ83MbNv1mugj4pfAn/tY36nAvIj4e0SsAh4CptSxf2ZmVqd6+ugvlLS07NrZvSzbF3isZpv2sszMzAZIfxP914FXAhOB1cDny3J1sW10VYGkWZIWS1q8Zs2afu6GmZn1pl+JPiKejIhNEfEi8E22dM+0A2NrNm0BnuimjtkRMSkiJo0ZM6Y/u2FmZn3Qr0Qvae+aX98BdIzIWQDMkLSjpHHAQcDC+nbRzMzqMbS3DSRdD0wFRktqBy4FpkqaSNEt8whwHkBE3C/pRuABYCNw
QURsasyum5lZX/Sa6CPirC6Kr+5h+8uBy+vZKTMzS8dXxpqZZc6J3swsc070ZmaZc6I3M8ucE72ZWeac6M3MMudEb2aWOSd6M7PMOdGbmWXOid7MLHNO9GZmmXOiNzPLnBO9mVnmnOjNzDLnRG9mljknejOzzDnRm5llzonezCxzTvRmZplzojczy5wTvZlZ5pzozcwy50RvZpY5J3ozs8w50ZuZZc6J3swsc070ZmaZc6I3M8ucE72ZWeac6M3MMudEb2aWOSd6M7PM9ZroJX1L0lOSltWU7SHpdkkry5+7l+WS9GVJD0laKum1jdx5MzPr3dA+bDMH+CpwTU3ZxcAdEXGFpIvL3z8GvBk4qLwdBXy9/DmotV58c7ePPXLFWyrcEzOz9Ho9o4+IXwJ/7lR8KjC3vD8XeHtN+TVRuAsYKWnvVDtrZmbbrr999K+IiNUA5c89y/J9gcdqtmsvy8zMbICk/jJWXZRFlxtKsyQtlrR4zZo1iXfDzMw69DfRP9nRJVP+fKosbwfG1mzXAjzRVQURMTsiJkXEpDFjxvRzN8zMrDf9TfQLgJnl/ZnA/Jrys8vRN0cD6zq6eMzMbGD0OupG0vXAVGC0pHbgUuAK4EZJ7wMeBc4oN78FOAV4CHgBeG8D9tnMzLZBr4k+Is7q5qETutg2gAvq3SkzM0vHV8aamWXOid7MLHNO9GZmmXOiNzPLnBO9mVnmnOjNzDLnRG9mljknejOzzDnRm5llzonezCxzTvRmZplzojczy1xf1oy1RLw2rZkNBJ/Rm5llzmf0vblst27K11W7H2Zm/eQzejOzzDnRm5llzonezCxzTvRmZplzojczy5wTvZlZ5pzozcwy17zj6Lsb3w4e425mVsNn9GZmmXOiNzPLXPN23eTGUy2YWYP4jN7MLHNO9GZmmXOiNzPLnBO9mVnmnOjNzDLnRG9mlrm6hldKegR4FtgEbIyISZL2AG4AWoFHgDMj4i/17aaZmfVXijP6N0bExIiYVP5+MXBHRBwE3FH+bmZmA6QRXTenAnPL+3OBtzcghpmZ9VG9iT6A2yS1SZpVlr0iIlYDlD/37OqJkmZJWixp8Zo1a+rcDTMz6069UyAcGxFPSNoTuF3Sg319YkTMBmYDTJo0KercDzMz60ZdZ/QR8UT58yng+8AU4ElJewOUP5+qdyfNzKz/+p3oJe0saZeO+8CJwDJgATCz3GwmML/enTQzs/6rp+vmFcD3JXXUc11E3CppEXCjpPcBjwJn1L+bZmbWX/1O9BHxB+CILsrXAifUs1NmZpaOr4w1M8ucE72ZWeac6M3MMudEb2aWOSd6M7PMOdGbmWXOid7MLHP1znVjg1DrxTd3Wf7IFW+peE/MbDDwGb2ZWeac6M3MMudEb2aWOSd6M7PMOdGbmWXOo26seV22Wzfl66rdD7NBzmf0ZmaZ8xm9mTUvf6rrEyf67Ul3/xTgfwyzjLnrxswsc070ZmaZc6I3M8uc++jNBgN/f2IN5DN6M7PMOdGbmWXOid7MLHPuozfbnvi7gO2SE731m1ey2nbd/s2GV7wjtl1xordBrbvECGmTY49x/Ma1fcvgU5D76M3MMudEb2aWOXfdmNmgVlX3Xc6c6M1646lwtws5f1HesK4bSSdLWiHpIUkXNyqOmZn1rCFn9JKGAFcBbwLagUWSFkTEA42IZ4NMBqMUrE7+FDSoNKrrZgrwUET8AUDSPOBUwInerAI5d0Nkq4EnSI3qutkXeKzm9/ayzMzMKqaISF+pdAZwUkS8v/z9PcCUiPhQzTazgFnlr68CVmxjmNHA0wl2d3uKk9Ox5BYnp2PJLc5gPpb9I2JMbxs1quumHRhb83sL8ETtBhExG5jd3wCSFkfEpP4+f3uMk9Ox5BYnp2PJLU4Ox9KorptFwEGSxknaAZgBLGhQLDMz60FDzugjYqOkC4GfAEOAb0XE/Y2IZWZmPWvYBVMRcQtwS6Pqp45un+04Tk7HklucnI4l
tzhNfywN+TLWzMwGD09qZmaWOSd6M7PMNcWkZpJe24fNNkTEfXXGOa0Pm60vv3/ob4x/7sNmz0fEN/obo4zT8GMp41R1PJW0gZxU2Aayams5aoo+eknPUgzZVA+bjYuI1jrjrAXm9xLn9RHxyjpirAa+3kuMd0XEwf2NUcZp+LGUcao6nqrawDO9bQKsrud4qohRxqmqDWTV1qpQVRvo0BRn9MCiiDi+pw0k/TRBnB9HxLm9xPl2nTH+MyL+rZcYO9cZA6o5FqjueKpqAw9HxGt6iXNPE8SA6tpAbm2t41PKZ4A9KZKugIiIXVPUT3VtoKirGc7ozaoi6YCOyfjq2WagY1h9JD0EvC0iljeo/krbQNMkekm7ASdTTI4WFFMq/CQi/po4zqspZtqsjbMg5Qsu6STg7Z1izI+IW1PFKOM0/FjKOFUdTyVtoIz1ito4EfFkk8aoqg3k1tZ+ExHHpqyzmzgNbwPQJIle0tnApcBtwONlcQvFfPf/GhHXJIrzMeAsYB7FfD0dcWYA8yLiigQxvgQcDFzTKcbZwMqIuKjeGGWchh9LGaeq46mqDUwE/h3YrVOcvwLnR8TdzRCjjFNVG8iqrZWxrgT2An4A/L2jPCK+l6j+StrAZhEx6G8UM1uO7KJ8d+D3CeP8HhjWRfkOFA0pSYxuypUqRlXHUvHxVNUGlgBHdVF+NHBvs8Soug3k1NbKOv9fF7dvJay/kjbQcWuWL2NF8dGmsxfp+Rv4bfUisA/wx07le5ePpbBe0pSIWNipfDKwPlEMqOZYoLrjqaoN7BwRv+tcGBF3pfqir6IYUF0byK2tERHvTVlfF6pqA0DzjLq5HLhb0m1sWdBkP4qP7Z9KGOcjwB2SVnaKcyBwYaIY5wBfl7QLWz5+jgWeKR9LpYpjgeqOp6o28GNJN1N0D3TEGUvRPZCqH7iKGFBdG6gqznuBr1XQ1pA0HHgfMAHYvC5X9DK6aBtU1QaAJumjB5C0O3ASxRcXonihfxIRf0kc52UUSyHWxlkUEZsSx9mrNkZE/Cll/WWMSo6ljFXF8VTVBt7Mli8WO+IsiDov+Kk6RhmnqvacW1v7DvAg8E7g34B3Acsj7fcAlbQBaKJEXxVJ+wHPRMRfJbUCkyhe4CTTLJfz82+Ijs5F6Y3Aa4H7I/3IgZcBRMSLZdxDgUci4s8p45SxhkXEhk5loyOiYSvzSJoWEV7noBeSxlB80bcRWBURz1UQc4/U7UzS4RGxNGWdPcS6JyJeI2lpRBwuaRjFSUWP13IMVk0/142kZJe8S7oY+AVwl6T3U3yEejNwYx8vv+6LRcDIMt6/UHRJ7AR8VNL/TRQDSW8HVgOPSzoV+BXwOWCppLcljPNGSe3AE5JuK98cO9yWMM5pnW/A7Jr7qeLsJukKScslrS1vy8uykc0So4wzXtJ/AXcCvwP+A7hP0pxyqGqqOJ/oFPP3QJukRyQdlSoOcI+khyR9StL4hPV2peOk5a+SDqUYHdOaqvKq2sBmqb/dbcQNOK2b2/8A1iSMcz9F0h0FPAuMKct3BpYlirGs5v5iYKfy/lBgacJjuYdieNg4ij7MV5Xl+wOLE8ZZBEwo758OrASO7tiHhHE2Aj8CvsWWURDPkn40xE+AjwF71ZTtBVwM3N4sMco676p53acAc8v7HwC+mzDO3TX3bwbeXBPztwnj3EPxqfRy4CHg3vJv1poqRk2s91OM6Ho98AfgKeC8ZmpnW8VLXWEjbhTvrnPoesjTswnjLC1/Dilf2JfVPJYq0f8WOLS8fyuwe3l/eKoYZX331Nxf1umxuxPGubfT7xMohkK+I3GcycAdwAfZ0uW4qgFtbUV/HhtsMbp5bWoT8gMJ49TWe0+nx1K+2d/d6fcpwBcovsxM9oZS1r0jRf/8JRTXb1wK/J+E9VfSBjpuzTLqZinwuYhY1vkBSf89YZy7JV1HcQZ/BzBX0q3A8cADiWL8E3CtpHsp3kwWS/oFcDjw6UQxgKKPPiJe
BM6tKRtCMb45lQ2S9oryC7GIuF/SCRRn33VNYlUrIhZJehPwIeCn5UU6jfiC6Y+S/hfF2e+TsPnqxXPYMjqiGWIAPCzpkxRt+TSKsduU/c0p//cPkLSA4gvFFkkvj4gXyseGJYyz1TDaKIZZLpT0UYoz75TmA+uANmoumEqoqjYANMmXsZKOA/4YEY928dikiFicKM5Q4AyKBPJdijOGdwKPAldFxPOJ4gwBTqS4ym8oW0aPJLuUX9Jk4L6IWN+pvBV4XUSkmGSq4412TUTc26l8N+DCiLg8RZxOde8DfAmYFBEHJK57d4qPz6dSTGgF8CTF4vafiQRfMFYRo4wzkuKMdDxFN8cVEfFs+docEhF3JYrzhk5FbRHxXJm4To+IqxLFeWdEXJeirj7EWhYRhzaw/s5tQMCfSNwGNsdrhkRvZlYlSbOBr0Qm6xs40Zv1QNLrKD7ZLYuIlKOIXk0xfvqu2k+Kkk6OxMNsbdtJeoDigq9VFF03HdMUH56o/qMohm0/I2knirP711J0EX86ItaliLM5nhO92RaSFkbElPL+B4ALgO9TdLX9MNJMbPfhst7lwETgooiYXz52d0T0ZTUtayBJ+3dVHhGdp3nob/33A0dExMby08MLFN3FJ5TlyYYMQ/NMgWBWldovD2cBb4qINZI+RzFcMcVMjB8Ajiz7sVuB70pqjYgrSTtvj/VTqoTeg5dFxMby/qSaN/dfS1qSPFjqCqsk6dTEF2R0F+fTkj4maVQzx8g0zvmSppdfpKfwMkm7l/utiFgDUHavbOz5qX02JMqrUyPiEWAq8GZJX6CCRN+Av9lAx6mkrSW2TFLHxGn3SpoEIOlgtlyslUxTJ3rgKOATkn7c4DgLKf7Jv9jkMXKMI+B1QJJ5wimugGyjuJhtDxXzqiBpBOmS8J9UzEcOQJn03wqMBg5LFKMnqf9mAx2nqraW0vuBN0h6mGJU1J2S/gB8s3wsKffRm/WBpJcDr4iIVQnqagE2RheTcUk6NiJ+U28Maw4qZuI8gHKYdWzPK0z1RNKbIuL2RHUNpZia9B0U82tvXqoMuDo6Tdo1WGPkGKeMVckycjmp6m9WRZwq21puckj0j0bEfonqup5iKa+5bL1U2Uxgj4iY3gwxMo1T2TJyuajqb1ZhnEraWo6aItGXl1d3+RBwfEQkWZFF0oqIeFU3j/0+Ig5uhhiZxumyLkmiWGLuoBRxclLV36zCOJW0tRw1y5exxwHfAD7fxS3l3Np/kXSGynncoZgvRtJ0INXiFlXEyDHOeklTuihPvoxcRqr6m1UVp6q2lp1mGUd/F/BCRPyi8wOSViSMMwP4DMVyZR0NZyTws/KxRsQQxUiPlDG6igPpj6XKOOdQzZKFXVIxr/sGijmPftQkMc6hmr9ZVXG6+t8ZCfyUtG1twDSqnTVF181AqBlH3cgVkhoeI7c4qmAZuW7i7kOx2PXRqSbpqipGVX+zKl+bqtp01RrWBpoh0UtS9LKjfdmmD3GmUcwi2YhpSbuLOQ54DcX84A8mrntXisVTHu5UXsmSbClHRJX17QUQEX9SsTzeccCDEZFqCunO8fYowqVdk7bKGJJeDzwZEStUzNtzNEVbS74uaae4n46ISxLXuR/wVESsL/v/z2HL/DDfrLnStKlU0s6aJNH/HLiJYrjWozXlO1BckDET+FlEzKkzzt+A54EfA9dTJP3Uiyj/ICLeXt4/lWK63Z8Dx1JMZjQnUZwzy7qforis/5yIWFQ+Vsl8KolHRJ1HMfGTKD6+n0OxItixwGcj4upEcfYDPksx58hfy3i7UnQPXFxeyTroY5RxvkQxIdtQihWNTqBo22+gWBDkXxLF+XIXxWdTjMIhIj6cKM4yYEpEvCDpMxTrHfyAYr0IIuLcnp4/mFTVBjaLxCuZNOJGsfrS+cBvKMbNPkCxvNcfKa4km5gozj0Uy4d9gGKxhieBfwfekPBYald++i0wrrw/mk4rAtUZZwmwd3l/CsWK9qd1
3ocEcRZ0c/sh8HzCOPcBL6dY5vE5yiXYytdrScI4dwLTKaYp6CgbQtEHfFezxCjrvJ8igbyc4svKl5flw0i7mlk78G2K5D6zvK3puJ8wzgM199vYegW4ZP87VdyqagOb6x7oA+7HH2gYRR/WyAbU3Xmpsr2AD5cvymOpYwALOz2WMgHf1+n3vct/jg93Ps464/wFeAvFWWLtbSpFl0Hy16bzP3Xiv9vK/jw22GKUdS0rfw4vX6eO9YmHkHYpwV0oPj1eB+xblv0hVf01cX5CMZwaik/4+5f3RzVhoq+kDXTcmmXUzWZRXP22ukHVd16q7E/Al4Evq5tpS/vhCEnPlLF2VLkMX9kNNSRRDIBnJb0yyv75iFgtaSrFR90JCeNUNSLqRUnDytf/LTUxhpN2mHCbpK9RXJTTsaTbWIqz03uaKAbAzZJ+RZHo/wO4UdJdFG/Ev0wVJCKeBT4i6Ujg25JupjFDt98PXCPpMopl/pZI6vgU/s8NiNdIVbUBoEn66KsiaWpE/HyAYo+kWN7tzkT1HUGRgFd2Kh8GnBkR16aIU5WyT/OJ6PSFm6R9Kf5u/5Uozg4Ul9mfSs0IEoruqKsjwRf1VcSoiXUMxRd9d0l6JcX0AY8C341iPeGkyi9JzweOiYh3p66/jHEIWy/DuagRx9JIVbYBcKLfShWjeyocQeQ427ncXhu3gf5rlitjq/IzSR8qzx43k7SDpOMlzaX4aDXYYzhOP0kaKuk8ST+WtFTSveX9fyo/DdWtihilrF6bCuM0XIVtoIjnN78tyv7ec4F3AeMohj0Np+g7v43iarW6Vn+pIobj1BUnp4ntuvqb7URxgtfo16aqOMnbQBWqagOb4znRd618Vx0N/C0i/tqsMRxnm+vOZmK7TvU2/WszEHEapeo24K6bbkTEhohY3chGVEUMx9lmOU1st1kmr03lcRqo0jbgRG+2tRnA6cCTkn4v6ffAn4DTSDuxXaNj2ODWuQ2spIFtwF03Zt1QRhPb2eBVRRvwGb1ZJ5J2VXGx2drafz5Jh6eO1UWMN6WOYYNbRKwFdpF0mqRXNyKGE71ZDRWTwT0I3CTpfkmTax6eU8EuJJmczQY3ST+ouX8qxWRmbwMWSDondbymmwLBrMEuAY7/W+TlAAACFklEQVQsp4yYAvynpEsi4nt0miKjv9Tz0pijUsSwQa92SpWPUczhs0rSaIoJFeekDOZEb7a1IRGxGiAiFkp6I/AjSS1Aqi+0jgPezUuXwRTFTKOWv9q2NDQiVgFExNOSkk/n4ERvtrUqJoOraiI4G7yqmtwQ8Kgbs60os8ngrLko8eSGm+t1ojfbQspnYjsbvKpuAx51Y7a1nCa2s8Gr0jbgM3qzGjlNbGeDV9VtwInerBs5TWxng1cl7cyJ3swsb+6jNzPLnBO9mVnmfMGUbbckbQLuqyl6e0Q8MkC7Y9Yw7qO37Zak5yJiRD+eNyQiNjVin8wawV03ZjUktUr6laS7y9t/K8unSvqZpOsoPwVIerekhZKWSPqGpOSXrpul4K4b257tJKljrPKqiHgH8BTwpohYL+kg4HpgUrnNFODQcpbBQ4DpwLERsUHS1yjGRF9T8TGY9cqJ3rZnf4uIiZ3KhgFflTQR2ATULtK8sGOWQeAE4EhgkSSAnSjeJMwGHSd6s639T+BJ4AiKrs31NY89X3NfwNyI+HiF+2bWL+6jN9vabsDqiHgReA/dTxl7B3C6pD0BJO0haf9utjUbUE70Zlv7GjBT0l0U3TbPd7VRRDwAfAK4TdJS4HZg78r20mwbeHilmVnmfEZvZpY5J3ozs8w50ZuZZc6J3swsc070ZmaZc6I3M8ucE72ZWeac6M3MMvf/AYZsjktoNR9QAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "fare = pd.crosstab(temp.Fare,temp.Survived)\n",
    "fare.plot.bar()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conclusion: passengers who paid higher fares had a higher survival rate than those who paid lower fares."
   ]
  },
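  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Raw counts are hard to compare across bins of very different sizes, so a per-bin survival rate may be easier to read. The cell below is a minimal, self-contained sketch on a small made-up table (`toy`, `toy_bins` and `rate_by_bin` are illustrative names, not part of the analysis above). The same `groupby(...).mean()` pattern should apply to `temp` if it is still in memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data, only to illustrate the pattern (not the Titanic data).\n",
    "toy = pd.DataFrame({\n",
    "    'Survived': [0, 1, 0, 0, 1, 1, 1, 0],\n",
    "    'Fare': [7.25, 71.28, 7.92, 8.05, 53.10, 8.46, 51.86, 21.07],\n",
    "})\n",
    "toy_bins = [0, 10, 50, 100]\n",
    "toy['FareBin'] = pd.cut(toy.Fare, toy_bins)\n",
    "\n",
    "# The mean of a 0/1 column is the share of survivors in each bin.\n",
    "rate_by_bin = toy.groupby('FareBin', observed=False)['Survived'].mean()\n",
    "rate_by_bin"
   ]
  },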
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Exploratory analysis\n",
    "\n",
    "Feature standardization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.383838</td>\n",
       "      <td>2.308642</td>\n",
       "      <td>0.647587</td>\n",
       "      <td>29.345679</td>\n",
       "      <td>32.204208</td>\n",
       "      <td>1.904602</td>\n",
       "      <td>0.602694</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>257.353842</td>\n",
       "      <td>0.486592</td>\n",
       "      <td>0.836071</td>\n",
       "      <td>0.477990</td>\n",
       "      <td>13.028212</td>\n",
       "      <td>49.693429</td>\n",
       "      <td>1.613459</td>\n",
       "      <td>0.489615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>223.500000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>22.000000</td>\n",
       "      <td>7.910400</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>28.000000</td>\n",
       "      <td>14.454200</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>668.500000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>35.000000</td>\n",
       "      <td>31.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>80.000000</td>\n",
       "      <td>512.329200</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       PassengerId    Survived      Pclass         Sex         Age  \\\n",
       "count   891.000000  891.000000  891.000000  891.000000  891.000000   \n",
       "mean    446.000000    0.383838    2.308642    0.647587   29.345679   \n",
       "std     257.353842    0.486592    0.836071    0.477990   13.028212   \n",
       "min       1.000000    0.000000    1.000000    0.000000    0.000000   \n",
       "25%     223.500000    0.000000    2.000000    0.000000   22.000000   \n",
       "50%     446.000000    0.000000    3.000000    1.000000   28.000000   \n",
       "75%     668.500000    1.000000    3.000000    1.000000   35.000000   \n",
       "max     891.000000    1.000000    3.000000    1.000000   80.000000   \n",
       "\n",
       "             Fare  familysize     isalone  \n",
       "count  891.000000  891.000000  891.000000  \n",
       "mean    32.204208    1.904602    0.602694  \n",
       "std     49.693429    1.613459    0.489615  \n",
       "min      0.000000    1.000000    0.000000  \n",
       "25%      7.910400    1.000000    0.000000  \n",
       "50%     14.454200    1.000000    1.000000  \n",
       "75%     31.000000    2.000000    1.000000  \n",
       "max    512.329200   11.000000    1.000000  "
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.describe()"
   ]
  },
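  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Age and Fare columns need to be standardized: their ranges (0 to 80 and 0 to about 512) are much wider than those of the other features."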
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 418 entries, 0 to 417\n",
      "Data columns (total 7 columns):\n",
      "PassengerId    418 non-null int64\n",
      "Pclass         418 non-null int64\n",
      "Sex            418 non-null int32\n",
      "Age            418 non-null int64\n",
      "Fare           418 non-null float64\n",
      "familysize     418 non-null int64\n",
      "isalone        418 non-null int64\n",
      "dtypes: float64(1), int32(1), int64(5)\n",
      "memory usage: 21.3 KB\n"
     ]
    }
   ],
   "source": [
    "test2.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\utils\\validation.py:475: DataConversionWarning: Data with input dtype int64 was converted to float64 by the scale function.\n",
      "  warnings.warn(msg, DataConversionWarning)\n"
     ]
    }
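   ],
   "source": [
    "# Standardize Age and Fare to zero mean and unit variance\n",
    "# (train and test are each scaled with their own statistics)\n",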
    "from sklearn import preprocessing\n",
    "train2.Age = preprocessing.scale(train2.Age.values)\n",
    "test2.Age = preprocessing.scale(test2.Age.values)\n",
    "train2.Fare = preprocessing.scale(train2.Fare.values)\n",
    "test2.Fare = preprocessing.scale(test2.Fare.values)"
   ]
  },
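  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above scales the training and test sets independently, so each is centred with its own mean and standard deviation. A common alternative is to fit the scaler on the training data only and reuse those statistics on the test data. The cell below is a minimal sketch of that idea on made-up frames (`toy_train`, `toy_test` and `cols` are illustrative names, not the notebook's `train2` and `test2`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "# Hypothetical toy frames standing in for the training and test sets.\n",
    "toy_train = pd.DataFrame({'Age': [22.0, 38.0, 26.0, 35.0],\n",
    "                          'Fare': [7.25, 71.28, 7.92, 53.10]})\n",
    "toy_test = pd.DataFrame({'Age': [34.0, 47.0], 'Fare': [7.83, 7.00]})\n",
    "\n",
    "cols = ['Age', 'Fare']\n",
    "# Learn the mean and standard deviation from the training set only.\n",
    "scaler = StandardScaler().fit(toy_train[cols])\n",
    "toy_train[cols] = scaler.transform(toy_train[cols])\n",
    "# Reuse the training statistics so both sets share one scale.\n",
    "toy_test[cols] = scaler.transform(toy_test[cols])\n",
    "toy_test"
   ]
  },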
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>8.910000e+02</td>\n",
       "      <td>8.910000e+02</td>\n",
       "      <td>891.000000</td>\n",
       "      <td>891.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.383838</td>\n",
       "      <td>2.308642</td>\n",
       "      <td>0.647587</td>\n",
       "      <td>-2.251909e-16</td>\n",
       "      <td>-4.373606e-17</td>\n",
       "      <td>1.904602</td>\n",
       "      <td>0.602694</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>257.353842</td>\n",
       "      <td>0.486592</td>\n",
       "      <td>0.836071</td>\n",
       "      <td>0.477990</td>\n",
       "      <td>1.000562e+00</td>\n",
       "      <td>1.000562e+00</td>\n",
       "      <td>1.613459</td>\n",
       "      <td>0.489615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>-2.253737e+00</td>\n",
       "      <td>-6.484217e-01</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>223.500000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>-5.641453e-01</td>\n",
       "      <td>-4.891482e-01</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>446.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>-1.033476e-01</td>\n",
       "      <td>-3.573909e-01</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>668.500000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>4.342497e-01</td>\n",
       "      <td>-2.424635e-02</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>891.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.890232e+00</td>\n",
       "      <td>9.667167e+00</td>\n",
       "      <td>11.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       PassengerId    Survived      Pclass         Sex           Age  \\\n",
       "count   891.000000  891.000000  891.000000  891.000000  8.910000e+02   \n",
       "mean    446.000000    0.383838    2.308642    0.647587 -2.251909e-16   \n",
       "std     257.353842    0.486592    0.836071    0.477990  1.000562e+00   \n",
       "min       1.000000    0.000000    1.000000    0.000000 -2.253737e+00   \n",
       "25%     223.500000    0.000000    2.000000    0.000000 -5.641453e-01   \n",
       "50%     446.000000    0.000000    3.000000    1.000000 -1.033476e-01   \n",
       "75%     668.500000    1.000000    3.000000    1.000000  4.342497e-01   \n",
       "max     891.000000    1.000000    3.000000    1.000000  3.890232e+00   \n",
       "\n",
       "               Fare  familysize     isalone  \n",
       "count  8.910000e+02  891.000000  891.000000  \n",
       "mean  -4.373606e-17    1.904602    0.602694  \n",
       "std    1.000562e+00    1.613459    0.489615  \n",
       "min   -6.484217e-01    1.000000    0.000000  \n",
       "25%   -4.891482e-01    1.000000    0.000000  \n",
       "50%   -3.573909e-01    1.000000    1.000000  \n",
       "75%   -2.424635e-02    2.000000    1.000000  \n",
       "max    9.667167e+00   11.000000    1.000000  "
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Machine learning prediction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>-0.564145</td>\n",
       "      <td>-0.502445</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0.664649</td>\n",
       "      <td>0.786845</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>-0.256947</td>\n",
       "      <td>-0.488854</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0.434250</td>\n",
       "      <td>0.420730</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>0.434250</td>\n",
       "      <td>-0.486337</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  Sex       Age      Fare  familysize  isalone\n",
       "0            1         0       3    1 -0.564145 -0.502445           2        0\n",
       "1            2         1       1    0  0.664649  0.786845           2        0\n",
       "2            3         1       3    0 -0.256947 -0.488854           1        1\n",
       "3            4         1       1    0  0.434250  0.420730           2        0\n",
       "4            5         0       3    1  0.434250 -0.486337           1        1"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 3.        ,  1.        , -0.56414531, -0.50244517,  2.        ,\n",
       "         0.        ],\n",
       "       [ 1.        ,  0.        ,  0.66464851,  0.78684529,  2.        ,\n",
       "         0.        ],\n",
       "       [ 3.        ,  0.        , -0.25694686, -0.48885426,  1.        ,\n",
       "         1.        ],\n",
       "       ...,\n",
       "       [ 3.        ,  0.        , -0.10334763, -0.17626324,  4.        ,\n",
       "         0.        ],\n",
       "       [ 1.        ,  1.        , -0.25694686, -0.04438104,  1.        ,\n",
       "         1.        ],\n",
       "       [ 3.        ,  1.        ,  0.20385083, -0.49237783,  1.        ,\n",
       "         1.        ]])"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Feature matrix x: every column except PassengerId and Survived\n",
    "x = train2.drop(['PassengerId','Survived'],axis=1).values\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1,\n",
       "       1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,\n",
       "       1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1,\n",
       "       1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0,\n",
       "       1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,\n",
       "       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0,\n",
       "       0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,\n",
       "       0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0,\n",
       "       0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0,\n",
       "       1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0,\n",
       "       1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1,\n",
       "       0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0,\n",
       "       0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,\n",
       "       1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1,\n",
       "       0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1,\n",
       "       1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0,\n",
       "       0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,\n",
       "       0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0,\n",
       "       0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1,\n",
       "       0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0,\n",
       "       1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0,\n",
       "       0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,\n",
       "       1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,\n",
       "       1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0,\n",
       "       0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1,\n",
       "       1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,\n",
       "       1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0,\n",
       "       0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1,\n",
       "       0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0,\n",
       "       0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0,\n",
       "       1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1,\n",
       "       0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0,\n",
       "       0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0,\n",
       "       1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1,\n",
       "       0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0,\n",
       "       0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,\n",
       "       0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,\n",
       "       0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1,\n",
       "       0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1,\n",
       "       1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1,\n",
       "       1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0], dtype=int64)"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Label vector y: the Survived column\n",
    "y = train2.Survived.values\n",
    "y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
       "          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,\n",
       "          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,\n",
       "          verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Split the data into a training set and a held-out test set\n",
    "from sklearn.model_selection import train_test_split\n",
    "x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2)\n",
    "\n",
    "# Build the classifier.\n",
    "# To try a different algorithm, only this cell needs to change; the cells above\n",
    "# and below stay the same because sklearn classifiers share a very similar API.\n",
    "# Note: this holds for supervised estimators (classification and regression);\n",
    "# clustering estimators are fitted on x alone, with no y.\n",
    "\n",
    "# Decision tree\n",
    "#from sklearn import tree\n",
    "#my_classifier = tree.DecisionTreeClassifier()\n",
    "\n",
    "# Logistic regression\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "my_classifier = LogisticRegression()\n",
    "\n",
    "# KNN\n",
    "#from sklearn.neighbors import KNeighborsClassifier\n",
    "#my_classifier = KNeighborsClassifier()\n",
    "\n",
    "# Fit the classifier on the training set\n",
    "my_classifier.fit(x_train,y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1,\n",
       "       0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,\n",
       "       0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1,\n",
       "       0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,\n",
       "       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1,\n",
       "       1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0,\n",
       "       1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,\n",
       "       0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1,\n",
       "       0, 0, 1], dtype=int64)"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Predict labels for the held-out test set\n",
    "predictions = my_classifier.predict(x_test)\n",
    "predictions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8212290502793296"
      ]
     },
     "execution_count": 100,
     "metadata": {},
     "output_type": "execute_result"
    }
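   ],
   "source": [
    "# Compare the predictions with the true labels of the held-out split to get the accuracy\n",
    "from sklearn.metrics import accuracy_score\n",
    "accuracy_score(y_test,predictions)\n",
    "# Accuracy observed on a single random split (values vary from run to run):\n",
    "#   decision tree:       0.78\n",
    "#   logistic regression: 0.81\n",
    "#   KNN:                 0.77\n",
    "\n",
    "# So logistic regression is used from here on"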
   ]
  },
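  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracies quoted for the three classifiers each come from one random 80/20 split, so differences of a few points may be noise. A minimal sketch of a steadier comparison, assuming the `x` and `y` arrays built earlier in this notebook: 5-fold cross-validation over the same three estimators. The fold count and `random_state=0` are illustrative choices, not part of the original analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hedged sketch: compare the three classifiers with 5-fold cross-validation.\n",
    "# Assumes x (features) and y (labels) are the arrays defined above.\n",
    "from sklearn.model_selection import cross_val_score\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "candidates = {\n",
    "    'decision tree': DecisionTreeClassifier(random_state=0),\n",
    "    'logistic regression': LogisticRegression(),\n",
    "    'KNN': KNeighborsClassifier(),\n",
    "}\n",
    "\n",
    "for name, clf in candidates.items():\n",
    "    # Each call returns one accuracy value per fold\n",
    "    scores = cross_val_score(clf, x, y, cv=5, scoring='accuracy')\n",
    "    print(name, 'mean accuracy:', round(scores.mean(), 3), 'std:', round(scores.std(), 3))"
   ]
  },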
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### After internal validation, predict on the actual test2 data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "      <th>familysize</th>\n",
       "      <th>isalone</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>0.333051</td>\n",
       "      <td>-0.497063</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>1.359016</td>\n",
       "      <td>-0.511926</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>2.542820</td>\n",
       "      <td>-0.463754</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>-0.219391</td>\n",
       "      <td>-0.482127</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>-0.613993</td>\n",
       "      <td>-0.417151</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Pclass  Sex       Age      Fare  familysize  isalone\n",
       "0          892       3    1  0.333051 -0.497063           1        1\n",
       "1          893       3    0  1.359016 -0.511926           2        0\n",
       "2          894       2    1  2.542820 -0.463754           1        1\n",
       "3          895       3    1 -0.219391 -0.482127           1        1\n",
       "4          896       3    0 -0.613993 -0.417151           3        0"
      ]
     },
     "execution_count": 101,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 3.        ,  1.        ,  0.33305132, -0.49706313,  1.        ,\n",
       "         1.        ],\n",
       "       [ 3.        ,  0.        ,  1.35901552, -0.51192594,  2.        ,\n",
       "         0.        ],\n",
       "       [ 2.        ,  1.        ,  2.54282037, -0.46375447,  1.        ,\n",
       "         1.        ],\n",
       "       ...,\n",
       "       [ 3.        ,  1.        ,  0.64873261, -0.50744487,  1.        ,\n",
       "         1.        ],\n",
       "       [ 3.        ,  1.        , -0.14047062, -0.49310546,  1.        ,\n",
       "         1.        ],\n",
       "       [ 3.        ,  1.        , -0.14047062, -0.23663968,  3.        ,\n",
       "         0.        ]])"
      ]
     },
     "execution_count": 102,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Feature matrix x2 for prediction: test2 without PassengerId\n",
    "x2 = test2.drop('PassengerId',axis=1).values\n",
    "x2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0,\n",
       "       1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,\n",
       "       1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,\n",
       "       1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,\n",
       "       1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,\n",
       "       0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0,\n",
       "       0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,\n",
       "       0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,\n",
       "       1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,\n",
       "       0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,\n",
       "       1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,\n",
       "       0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1,\n",
       "       0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0,\n",
       "       0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,\n",
       "       0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,\n",
       "       1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0,\n",
       "       0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,\n",
       "       1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1,\n",
       "       0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0],\n",
       "      dtype=int64)"
      ]
     },
     "execution_count": 103,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prediction2 = my_classifier.predict(x2)\n",
    "prediction2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combine the predictions with the test-set PassengerId values into a table, write it to CSV, and upload it to see the score."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>897</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>898</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>899</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>900</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>901</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>902</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>903</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>904</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>905</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>906</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>907</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>908</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>909</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>910</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>911</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>912</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>913</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>914</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>915</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>916</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>917</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>918</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>919</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>920</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>921</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>388</th>\n",
       "      <td>1280</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>389</th>\n",
       "      <td>1281</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>390</th>\n",
       "      <td>1282</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>391</th>\n",
       "      <td>1283</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>392</th>\n",
       "      <td>1284</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>393</th>\n",
       "      <td>1285</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>394</th>\n",
       "      <td>1286</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>395</th>\n",
       "      <td>1287</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>396</th>\n",
       "      <td>1288</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>397</th>\n",
       "      <td>1289</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>398</th>\n",
       "      <td>1290</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>399</th>\n",
       "      <td>1291</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>400</th>\n",
       "      <td>1292</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>401</th>\n",
       "      <td>1293</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>402</th>\n",
       "      <td>1294</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>403</th>\n",
       "      <td>1295</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>404</th>\n",
       "      <td>1296</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>405</th>\n",
       "      <td>1297</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>406</th>\n",
       "      <td>1298</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>407</th>\n",
       "      <td>1299</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>408</th>\n",
       "      <td>1300</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>409</th>\n",
       "      <td>1301</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>410</th>\n",
       "      <td>1302</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>411</th>\n",
       "      <td>1303</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>412</th>\n",
       "      <td>1304</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>413</th>\n",
       "      <td>1305</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>414</th>\n",
       "      <td>1306</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>415</th>\n",
       "      <td>1307</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>416</th>\n",
       "      <td>1308</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>417</th>\n",
       "      <td>1309</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>418 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived\n",
       "0            892         0\n",
       "1            893         0\n",
       "2            894         0\n",
       "3            895         0\n",
       "4            896         1\n",
       "5            897         0\n",
       "6            898         1\n",
       "7            899         0\n",
       "8            900         1\n",
       "9            901         0\n",
       "10           902         0\n",
       "11           903         0\n",
       "12           904         1\n",
       "13           905         0\n",
       "14           906         1\n",
       "15           907         1\n",
       "16           908         0\n",
       "17           909         0\n",
       "18           910         1\n",
       "19           911         0\n",
       "20           912         0\n",
       "21           913         0\n",
       "22           914         1\n",
       "23           915         1\n",
       "24           916         1\n",
       "25           917         0\n",
       "26           918         1\n",
       "27           919         0\n",
       "28           920         0\n",
       "29           921         0\n",
       "..           ...       ...\n",
       "388         1280         0\n",
       "389         1281         0\n",
       "390         1282         1\n",
       "391         1283         1\n",
       "392         1284         0\n",
       "393         1285         0\n",
       "394         1286         0\n",
       "395         1287         1\n",
       "396         1288         0\n",
       "397         1289         1\n",
       "398         1290         0\n",
       "399         1291         0\n",
       "400         1292         1\n",
       "401         1293         0\n",
       "402         1294         1\n",
       "403         1295         1\n",
       "404         1296         0\n",
       "405         1297         0\n",
       "406         1298         0\n",
       "407         1299         0\n",
       "408         1300         1\n",
       "409         1301         1\n",
       "410         1302         1\n",
       "411         1303         1\n",
       "412         1304         1\n",
       "413         1305         0\n",
       "414         1306         1\n",
       "415         1307         0\n",
       "416         1308         0\n",
       "417         1309         0\n",
       "\n",
       "[418 rows x 2 columns]"
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t = pd.DataFrame({'PassengerId': test2.PassengerId, 'Survived': prediction2})\n",
    "t"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [],
   "source": [
    "t.to_csv('t_LogisticRegression.csv',index=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>892</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>893</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>894</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>895</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>896</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived\n",
       "0          892         0\n",
       "1          893         1\n",
       "2          894         0\n",
       "3          895         0\n",
       "4          896         1"
      ]
     },
     "execution_count": 109,
     "metadata": {},
     "output_type": "execute_result"
    }
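   ],
   "source": [
    "# gender_submission.csv is Kaggle's sample submission (a sex-based baseline), not the true outcomes\n",
    "zhenshi = pd.read_csv('gender_submission.csv')\n",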
    "zhenshi.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9473684210526315"
      ]
     },
     "execution_count": 118,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Fraction of test passengers on which our predictions agree with gender_submission.csv\n",
    "zqv = (zhenshi['Survived'].values == t['Survived'].values).sum() / zhenshi.shape[0]\n",
    "zqv"
   ]
  },
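  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Kaggle describes `gender_submission.csv` as an example submission that assumes all female passengers, and only they, survive. A quick way to confirm this is sketched below, assuming the raw Kaggle `test.csv` (with its original `Sex` column) sits in the working directory; the file path and the variable names are illustrative. If the printed fraction is 1.0, the reference file is the sex-only rule rather than ground truth."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hedged sketch: check whether gender_submission.csv is just the sex-only rule.\n",
    "# Assumption: the raw Kaggle test.csv is available next to this notebook.\n",
    "import pandas as pd\n",
    "\n",
    "raw_test = pd.read_csv('test.csv')\n",
    "sample = pd.read_csv('gender_submission.csv')\n",
    "\n",
    "# Rule-based baseline: predict survival for women only\n",
    "sex_rule = (raw_test['Sex'] == 'female').astype(int)\n",
    "\n",
    "# Fraction of rows where the sample file matches the sex-only rule\n",
    "agreement = (sex_rule.values == sample['Survived'].values).mean()\n",
    "print(agreement)"
   ]
  },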
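  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Conclusion\n",
    "The logistic regression predictions agree with `gender_submission.csv` on 94.7% of the test passengers. That file is Kaggle's sample submission (it predicts that all women survive and all men do not), not the true outcomes, so 94.7% is an agreement rate with a baseline rather than the model's real accuracy. The real test accuracy is the score Kaggle reports after the CSV is uploaded; on the held-out split of the training data the accuracy was about 0.82."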
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
